Tuesday, December 16, 2008

The "ESH"

For people less acquainted with the integration world, the word "Bus" in "Enterprise Service Bus" suggests that an ESB is something distributed. On the contrary, 95% of ESB deployments are hub and spoke: one or a few centrally located servers through which all messages pass. Distributed execution of integration logic remains the exception.

Therefore, we should maybe introduce the "ESH", the Enterprise Service Hub?

Notes:
  1. Older integration solutions often had their adapters running on the same servers as the back-end applications or databases, away from the central message broker. Nowadays, all the adapter logic is put into the central hub as well.
  2. Obviously, every ESB can be deployed in a distributed manner, with the nodes interconnected by some messaging solution. But that's definitely not the standard approach.
  3. Why is integration logic put centrally? One justification is to avoid disturbing the servers on which the back-end applications are running.
  4. Maybe lighter-weight, open source ESBs will make distributed execution of integration logic more popular, e.g. when such an open source ESB is deployed along with a J2EE application.

Sunday, December 14, 2008

Devoxx 2008

Last week I was at the Devoxx conference in Antwerp, just 20 km from where I live and the city where I grew up. Being one of the steering members, I gave a hand here and there and was involved in selecting the talks, in particular regarding SOA and security. I really enjoyed the conference. Some of my highlights:
  • Logo: simply love it
  • Venue: the Metropolis movie theater is a really nice location and the logo on these big screens looks soooo nice; the seats are just a bit too comfortable: my eyes seem to close automagically
  • The team: I really had fun times this week with Frederik, Sven, Valérie, Jo, Stijn, Stephan, Gert, Dan, ...
  • DataPower: the IBM partner talk obviously had some commercial aspect, but the insights on XML threats and the idea of an 'ESB in hardware' were simply awesome
  • Paul Fremantle's talk on complex event processing and the conversation afterwards, e.g. on AMQP and the "unreliability of WS-ReliableMessaging" (Paul is the WS-RX spec lead)
  • XSLT 2.0 by Doug Tidwell: XML remains relevant and Doug can bring his story in such a funny way (thanks Robin for arranging this)
  • REST talk by Stefan Tilkov: although I have a more biased view on the REST and WS-* story, Stefan brings his message so well
And so much more: JAX-RS talk, OpenMQ, XML Persistence by John Davies and meeting Mr Ivar Jacobson at the Devoxx reception desk.

Already looking forward to Devoxx 2009! And thinking about new topics and speakers in the SOA/security area for 2009: Smooks, more cloud computing, new ESB features, BPM and BPEL (BPELScript?), design- and runtime governance of services, Master Data Management, new XML stuff, claims based security, trusted computing, ... Any suggestions?

Sunday, November 23, 2008

Microsoft ESB and WS mediation

In this post, I'll cover the 2nd question that I raised in my previous post: "where to do transformations (and routing, monitoring, ...) of the web service interactions?" When a consumer and a service use different service definitions, how to transform between both in a Microsoft world? Or re-phrased: where is the Microsoft ESB to mediate synchronous, request/response web services?

When searching on "Microsoft ESB", one quickly ends up at Enterprise Service Bus Guidance. This is not a real product but documentation and components developed by Microsoft's Architecture team on top of BizTalk. As BizTalk is a hub-and-spoke integration solution that persists every incoming message in its MessageBox, BizTalk isn't really suitable to be an intermediary for synchronous, request/response service invocations. As such, BizTalk lacks crucial features to be called an ESB, although it remains a nice integration solution.

With no Microsoft ESB available, what are the options?
  • Java based ESB: commercial (WebSphereESB, AquaLogic/OracleESB, Tibco BusinessWorks/ActiveMatrix, SoftwareAG, JCAPS/OpenESB) or open source (Synapse, WSO2 ESB)
  • Runtime governance tools such as Amberpoint that are also capable of doing transformations. Microsoft has SOA Governance integration with Amberpoint and SOA Software.
  • XML appliances

But none of these options is really appealing to the average Microsoft shop. If there is full access to the .Net source code on one side (consumer or provider), some custom transformation logic can be added. But there don't seem to be any clear hooks for transformation in WCF.

Note: Microsoft's new cloud computing platform - Azure - specifies an Enterprise Service Bus in its .Net Services. The preliminary documentation of the Microsoft .Net Service Bus talks about naming, different types of RelayBindings and security. But transformation and routing of messages aren't covered (yet).

Saturday, November 22, 2008

Microsoft .Net and JMS?

Recently I was challenged by a customer with a strong Microsoft focus that required integration with a newly acquired application based on Java/JEE. Both the Microsoft .Net and Java side supported Web Services. The service contracts - message formats - were obviously different. The Java side also leveraged JMS for asynchronous communication.

That brought up 2 very interesting questions:
1. How to link a .Net application to JMS?
2. Where to transform (and route, monitor, ...) the web service interactions?
In this entry, I'll cover the JMS question. In a next posting, I'll discuss the Web Service mediation question.

To start, there is no out-of-the-box solution: no generic .Net component to talk to JMS, no generic MSMQ/JMS bridge and no standard .Net version of the JMS API. Below is a list of other options:

  • Most JMS providers come with some .Net (or COM) API, although all of them are proprietary. E.g. IBM has WebSphere MQ classes for .NET and XMS.Net. WCF bindings from JMS providers are hardly available: Tibco has announced one (is it available already?) and IBM has a prototype available.

  • Some JMS implementations expose a REST-like interface, so simple interactions over HTTP. In case of WebSphereMQ, this is the MQ Bridge for HTTP.
  • Microsoft BizTalk has had a WebSphereMQ adapter for a long time and more recently, a TibcoEMS adapter has become available as well. But BizTalk does not have a generic JMS adapter.

  • JNBridge is a company providing .Net/Java interoperability products (and earlier COM/Java interoperability). JNBridge has .Net JMS adapters: one for BizTalk and one for .Net.

  • And Host Integration Server has a MSMQ/MQSeries bridge.

The 1st or 2nd option has my preference: although you're programming against a proprietary API, neither BizTalk nor 3rd party software is needed.

Note: using the JMS API with WebSphereMQ introduces an extra complexity because of the way JMS header fields are mapped to the MQ message structure. MQ uses the MQRFH2 header to store JMS specific properties. The Microsoft ESB Guidance comes with a BizTalk Pipeline component that provides support for this MQRFH2 header.

Note: not 100% sure, but most probably the .Net or MSMQ adapters of Java based integration solutions such as WebSphereESB or JCAPS use JNBridge underneath.

Saturday, October 25, 2008

Amazon cloud computing goes fast

Right now, I have my own (virtual) server running in the Amazon data center. Getting such an Amazon server running has really become very easy. With Elasticfox, a plug-in for Firefox, everything can be configured in a trivial and user-friendly way. No more need to use command line tools or write your web service calls yourself. Just follow the Getting Started Guide.

In July of this year, I read the book "Programming Amazon Web Services" by James Murty. Great book, with lots of Ruby code explaining how to invoke the low-level Amazon web services. The book was published in February 2008 and was already a bit outdated when I read it during the summer; it is getting more and more behind, as Amazon is implementing new features at a rapid pace:
  • Public IP address (Elastic IP address); earlier, one needed a computer elsewhere with a fixed IP address to forward clients to the server located at Amazon (e.g. through HTTP 302 or other)
  • Local, permanent file system (Elastic Block Store), earlier one needed to leverage S3
  • Lower prices
  • Windows support, before there were only *nix distributions available
  • Database support with Oracle on Linux and now SQL Server on Windows
  • No more beta but full production with SLA
  • Elasticfox plug-in along with good documentation
So now I have my own simple Windows 2003 server with a fixed IP address and DNS name. Accessing the server goes fine with Remote Desktop. The responsiveness is not always top, but similar to a local VMWare instance. By the way, this is a perfect alternative for VMWare and a serious competitor! I have the smallest server instance type running, which is obviously virtualised at Amazon. But it looks like a dual-core Opteron with 1.66 GB of memory. And bandwidth is phenomenal: downloading Acrobat at more than 8 MByte/s.

Amazon is already announcing future features such as load balancing, monitoring and automatic scaling (automatically launching extra server instances). Strange that charging is still done via credit card. But I assume that big users can get a real invoice with payment terms.

Extra remarks:
  • On Friday Dec. 12, the Amazon evangelist Simone Brunozzi will give a talk at the Devoxx conference.
  • Running the server instance during a couple of hours had a cost of 70 dollar cents, mostly because I left the elastic IP address unused for a while

Sunday, October 12, 2008

Simple messaging protocols

Messaging systems such as IBM's WebSphereMQ and similar use proprietary messaging protocols. So some library is always needed at the client side to talk the proprietary language to the messaging server. If such a library is not available for your programming language, you're out of luck. Regarding standard APIs, JMS seems to be the only one ever defined.

If a messaging server exposes a simple, openly documented protocol, ideally over HTTP, it becomes possible to talk to the messaging server from any programming language. ActiveMQ is a good example in that area with its text-based STOMP protocol (which runs over plain TCP rather than HTTP). IBM has the "MQ Bridge for HTTP". And OpenMQ 4.3 now has the UMS protocol.

From quickly skimming over the REST versions of these protocols - UMS of OpenMQ in particular - they are not "pure REST" but rather "REST-RPC Hybrid" (cf. the great book "RESTful Web Services"): 1) HTTP POST is used instead of GET, PUT or DELETE, 2) the actual action is part of the URL parameters and 3) the interactions become stateful through a Logon service request.

When I think about a "pure REST" approach for messaging, I expect to see URLs such as http://mq.my-org.be/.../domain/queue. Sending a message becomes an HTTP PUT action. Peeking a message is an HTTP GET action. And receiving a message should become an HTTP GET followed by an HTTP DELETE (receiving a message is normally "destructive" in the messaging world). How to avoid concurrency issues in this receive scenario, with multiple clients receiving the same message, is a REST concurrency question that I gladly pass on ;-)
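To make the verb mapping above concrete, here is a minimal in-memory sketch in Python. It is not a real messaging server and not the API of any product: the class `QueueResource`, the message ids and the status codes (201, 204, 404, 409) are assumptions chosen for illustration. It also shows one possible answer to the concurrency question: the DELETE names the message the client saw, and fails if another client got there first.

```python
from collections import deque


class QueueResource:
    """In-memory stand-in for one queue exposed at a URL such as /domain/queue."""

    def __init__(self):
        self._messages = deque()
        self._next_id = 1

    def handle(self, method, body=None, message_id=None):
        """Dispatch an HTTP-like verb; returns (status_code, payload)."""
        if method == "PUT":
            # Send: append the message to the tail of the queue.
            msg_id = self._next_id
            self._next_id += 1
            self._messages.append((msg_id, body))
            return 201, msg_id
        if method == "GET":
            # Peek: return the head message without removing it.
            if not self._messages:
                return 404, None
            return 200, self._messages[0]
        if method == "DELETE":
            # Destructive half of a receive: remove the head only if it is
            # still the message this client saw in its GET.
            if self._messages and self._messages[0][0] == message_id:
                self._messages.popleft()
                return 204, None
            return 409, None  # another client already consumed it
        return 405, None


def receive(queue):
    """Receive = GET followed by DELETE; returns the body, or None on failure."""
    status, head = queue.handle("GET")
    if status != 200:
        return None
    msg_id, body = head
    status, _ = queue.handle("DELETE", message_id=msg_id)
    return body if status == 204 else None
```

With two clients that both GET the same head message, only the first DELETE succeeds; the second gets the (assumed) 409 and has to retry its GET. That keeps the server stateless between requests, at the price of extra round trips under contention.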

Closing remarks:
  • Another approach is taken by AMQP: this initiative standardizes a binary protocol between client and messaging server. Any AMQP client library in whatever programming language should be able to communicate with any AMQP compliant server. Adoption of AMQP is rather limited.
  • Existing messaging products (WebSphereMQ, SonicMQ, ...) can tunnel their protocol over HTTP(S), but that still requires use of their respective client libraries.
  • Most JMS messaging solutions support .Net (and optionally COM).
  • At Devoxx 2008, Linda Schneider will talk about "Connectivity with OpenMQ" and Bruce Snyder will give a university session about "ActiveMQ and ServiceMix".

Wednesday, October 8, 2008

TCP/IP vulnerability?

Security Now is a great podcast about all sorts of security topics. Nr 164 is about "Sockstress". There seems to be a serious problem in almost any TCP/IP stack, including those of routers! Steve Gibson (the security person driving this podcast) based his account on a Dutch podcast called "De beveiligingsupdate" ("Security update").

I have some understanding of networking but am not a specialist; as far as I can tell, this attack is launched after the 3-way TCP handshake is done. After such a handshake, a reasonable amount of trust has been created, as the server knows the IP address of the client. And implicitly it assumes that the client will behave according to the TCP/IP rules.

So this attack only starts after the TCP/IP connection has been established. First of all, the client reduces its resource consumption by encoding information about the connection in the sequence numbers in the headers of the packets. As such, it needn't keep state. Secondly, the client doesn't use the TCP/IP stack of the client machine itself but has its own implementation in user space, based on raw sockets. And then it starts playing dirty tricks, e.g. by telling the server that it doesn't have any buffer space left. The server will wait a certain amount of time and then try to resume sending. By forcing the server to manage this large set of connections, with all the associated resource consumption (memory and timers), the TCP/IP service is brought to its knees. And potentially the complete OS crashes! This problem and the corresponding attack seem to have been known for 3 years, but only now are they coming out in the open.

Anyway, this is the way I understood it. After the DNS poisoning issue, this seems a very fundamental attack. If this story is true, and no countermeasures are found, this might become an important issue. Not only a crisis in the financial world, but also a crisis in Internet land.

Note: there is a related Dutch podcast called "Ict roddels" (ICT gossip), recommended to native Dutch speakers

Sunday, October 5, 2008

Password renewal in adapters

ESBs use adapters to connect to all sorts of systems: back-end applications, databases, queueing systems, (S)FTP(S) servers, Web Services, HTTP(S) servers or B2B counterparts. The ESB usually uses a technical user account to connect to these systems, unless the real identity of a human user is carried along to the back-end systems (identity propagation).

Larger organizations enforce password change policies. But changing the password with which such a technical user connects to one of these other systems is a tough task. The password change in the target system and in the ESB need to happen at the same time. And to avoid any problems or disturbing the business, this usually means late at night or in the middle of the weekend (when the system goes down for scheduled maintenance).

It would be nice if adapters provided support for such password changes. One option would be to pre-configure a new password and the date and time from which it should be applied. Another alternative is the configuration of 2 or 3 passwords: if the 'current' password doesn't work, try the other (newer) ones.
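Both ideas can be combined in a few lines. The following is a minimal Python sketch, not the behaviour of any existing adapter: the function names, the `(valid_from, password)` tuple layout and the use of `PermissionError` to signal a failed login are all assumptions for illustration. Passwords whose start date has passed are tried newest first, falling back to the older ones when the target system has not switched yet.

```python
from datetime import datetime


def candidate_passwords(credentials, now):
    """Return the passwords to try, newest effective one first.

    `credentials` is a list of (valid_from, password) tuples; entries whose
    valid_from lies in the future are skipped.
    """
    active = [(start, pwd) for start, pwd in credentials if start <= now]
    active.sort(key=lambda item: item[0], reverse=True)
    return [pwd for _, pwd in active]


def connect_with_rollover(connect, credentials, now=None):
    """Try each candidate password until `connect(password)` succeeds."""
    now = now or datetime.now()
    last_error = None
    for password in candidate_passwords(credentials, now):
        try:
            return connect(password)
        except PermissionError as exc:
            # Wrong password for this target system: try the next one.
            last_error = exc
    raise last_error or PermissionError("no active password configured")


def demo_connect(password):
    """Hypothetical back end that still only accepts the old password."""
    if password != "old-secret":
        raise PermissionError("login failed")
    return "session-for-" + password
```

One caveat with the fallback approach: every failed attempt counts as a failed login on the target system, so the list of passwords should stay shorter than that system's account lock-out threshold.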

PS: a similar problem is the changeover of encryption keys

Saturday, October 4, 2008

Oracle in the cloud

While stuck in Belgian traffic jams, I listen a lot to podcasts. One such podcast is the "Oracle Technology Network Techcasts". One of the latest podcasts - recorded at Oracle OpenWorld - was about Oracle and "the cloud". Interesting to learn that Oracle products will become available on Amazon's cloud computing infrastructure. So Oracle will officially support deployments of its database on EC2. Oracle also makes available pre-configured Amazon Machine Images (AMI) containing the Oracle database.

But more interesting to me was the announcement that Oracle is also making available its Fusion middleware in the cloud. That should mean that it becomes possible to run Oracle's SOA suite, the Oracle BPEL engine or the Oracle B2B server in the cloud!

When checking out the list of AMIs that Oracle makes available, there is no Fusion middleware yet. Looking forward to getting more detailed information about Oracle middleware in the cloud.

Final note: next to Linux, Amazon will also start providing Windows images (virtual machines)

Monday, September 29, 2008

FTP = HTTP GET ftp://...

Today I was in for a surprise: at a customer I was investigating how to reach an FTP server outside their firewall. From within my browser I could easily reach the FTP server with the URL
ftp://user:password@ftp.company.com, thereby going through the FTP/HTTP proxy.

But when I tried to do the same with FileZilla, the free FTP client, I couldn't access the FTP server. Ultimately I did some sniffing using WireShark. And to my surprise, the browser was talking HTTP to the proxy, no FTP on the wire! I noticed the HTTP request "HTTP GET ftp://..." being sent to the proxy. And FTP listings coming back, formatted as HTML!

So I learned a new thing: "FTP over HTTP"!
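For illustration, this is roughly the shape of what goes over the wire in that case: an ordinary HTTP request whose request line carries the complete ftp:// URL. The Python sketch below only builds the request text; the function name `build_proxy_request` is made up, and the exact headers a real browser sends (and how it passes the credentials) will differ.

```python
from urllib.parse import urlsplit


def build_proxy_request(ftp_url):
    """Return the HTTP request text sent to the proxy for an ftp:// URL."""
    parts = urlsplit(ftp_url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp:// URL")
    path = parts.path or "/"
    lines = [
        # The request line carries the absolute ftp:// URL, not just a path;
        # the proxy speaks FTP to the server on the client's behalf.
        f"GET {parts.scheme}://{parts.netloc}{path} HTTP/1.1",
        f"Host: {parts.hostname}",
        "",
        "",
    ]
    return "\r\n".join(lines)


print(build_proxy_request("ftp://user:password@ftp.company.com"))
```

A plain FTP client such as FileZilla speaks the FTP protocol itself, which explains why it cannot get through a proxy that only accepts HTTP unless it is configured to use that proxy in a mode it supports.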

Saturday, September 13, 2008

DeVoxx conference

DeVoxx is the new name of the Javapolis conference. DeVoxx is the Java conference taking place in Antwerp, Belgium. The conference location is a large movie theatre, close to where I live. Last year, Javapolis, eh DeVoxx had 3000+ attendees.

I'm very pleased with the new logo and very curious to know what the T-shirts will look like!

PS: as a member of the DeVoxx steering committee, I try to find interesting speakers for the SOA track

Thursday, September 4, 2008

XQuery or XSLT?

An important part of every ESB is transformation. Most ESBs use an XML representation internally. Non-XML messages are first converted to an XML representation and vice-versa. The conversion between non-XML and XML representations can become a challenge (and transformation) in itself.

The main transformation logic in (most) ESBs thus becomes XML-2-XML. Most ESBs use XSLT for transformation (BizTalk, IBM ESB, Tibco BW, Oracle SOA, ...). And most commercial integration tools still use XSLT 1.0.

But there is another standard XML technology that can be used for transformations: XQuery. The only 'big' player in the integration world supporting XQuery is BEA AquaLogic. Obviously, open source ESBs such as Mule or WSO2 ESB provide XQuery support as well.

Regardless of the big discussions on XSLT vs. XQuery, I think XQuery would be a very welcome addition in the toolset of integration developers. Adding an XQuery transformation component in the palette of building blocks can't be that hard.

The IDE support for XQuery is obviously more complex. Development of XML assets such as XML Schemas and XSLT transformations is better supported in 3rd party XML tools anyway.

PS: haven't used it yet, but the XQuery support in StylusStudio looks as good as their XSLT support

Friday, August 29, 2008

More SOA books

I'm always on the lookout for good books in the area of Integration, SOA, BPM and Web Services. I recently skimmed through a couple. None of them are medal winners, but some parts are worth the read.

Through my ACM membership, I have access to a limited list of books on books24x7. One of them was "Enterprise Architecture and Integration - Methods, Implementation, and Technologies". Each chapter is written by a different group of authors. The quality of the different chapters and authors varies strongly. The best part is the 1st chapter, written by Wing Lam and Venky Shankararaman, the editors themselves. It is really great! The authors give a great overview on EAI and its relationship with SOA/BPM. The following chapters by different authors are (in my opinion) of lower quality. The 4th chapter was interesting again as it discussed SAP Netweaver and SAP XI.
If you have access to books24x7, go check out that 1st chapter. But don't spend your money on the book itself.

Via pdfchm.com, I stumbled upon "SOA Approach to Integration - XML, Web Services, ESB and BPEL in real-world SOA projects" by Matjaz B. Juric, Ramesh Loganathan, Poornachandra Sarang and Frank Jennings. The combination of SOA and Integration in the title set my expectations high. The 1st and 2nd chapter are nice introductory material. But further down, the chapters don't go into much detail. E.g. the BPEL chapter is really about the BPEL XML syntax.

Another book available on pdfchm.com is "SOA and WS-BPEL" by Yuli Vasiliev. The title of the book should rather be "PHP and Web Services". The book is well written. Chapters 1 to 4 go into ... The 5th chapter goes into BPEL; the 6th and last chapter shows how to implement an example using ActiveBPEL. And although I don't know much about PHP, the book looks very interesting for PHP developers.

Tuesday, August 19, 2008

Queuing in the cloud?

A post on the eai-select newsgroup mentioned OnlineMQ as an alternative to Amazon SQS. As stated in earlier posts, I think there is room and there are opportunities for cloud based integration solutions. A hosted queuing solution is a first step.

Therefore, I looked a bit around at the OnlineMQ website. I created a free ("Silver") account and played a bit around. The service will officially launch in September. The paid subscription ("Gold") is 60$/connection, no mention is made about maximum number or volume of messages.

Maximum message size is 256 KB, which is reasonable. Interesting feature is the option to configure a number of fixed IP addresses that are allowed to reach the server. It is typical in B2B solutions to work with fixed IP addresses and add extra security by only allowing access from the fixed IP addresses of your business partners. Which obviously requires everyone to have fixed IP addresses.

OnlineMQ talks about JMS but doesn't seem to support the JMS API (yet). However, OnlineMQ does support REST, POX (Plain Old XML) and SOAP interfaces. For the SOAP interface, it uses the (non WS-I compliant) rpc/encoded style. And security is not based on WS-Security.

It would be interesting to know more about the company or individuals behind this initiative. Even more as the company seems to be located nearby in the Netherlands. The online agreement refers to UK law being applicable, so that points to a British initiative. From a reverse DNS lookup, I learn that the IP address is owned by Level3. So the servers are located at a Level3 facility.

Friday, August 8, 2008

AMQP enthusiasm?

All existing messaging solutions (WebSphereMQ, JMS, ...) use proprietary protocols. This is not a problem within a single organization. But between organizations, standard protocols are needed. Therefore, the B2B world uses protocols such as AS2, RNIF (RosettaNet) or good old (S)FTP(S).

AMQP is an initiative to bring a standard binary wire protocol to the messaging world. Just like POP3+SMTP allows you to retrieve and send emails using whatever email server, AMQP will allow any AMQP client to receive and send messages via any AMQP compliant server.

But when I read the spec, AMQP is focusing on the client-server protocol, contrary to SMTP that is (also) used for communication between mail servers. The AMQP spec states that a bridge should be used for server-2-server communication, but doesn't provide any details. As such, AMQP is focusing on messaging within the corporate firewall.

AMQP can be used for unbalanced B2B scenarios, where one side runs the AMQP broker. This is a setup similar to one big company or intermediary running an (S)FTP(S) server and smaller organizations putting and retrieving files from it. But for good decoupling, server-to-server communication is preferred. The server at the sending side will take care of delivering the message to the server at the opposite side. This is what e.g. AS2 does: once an organization has an AS2 server in place, it becomes equal to all its AS2 counterparts.

With all this in mind, I was a bit puzzled by Paul Fremantle's enthusiasm about AMQP. In particular because he is the WS-RM spec lead.

WS-ReliableMessaging should have brought reliable async messaging to the WS-world. But it didn't. The WS-RM spec doesn't mention message persistence and so (most) vendors have an in-memory implementation, which is not reliable.

I still remember going through the book "Programming Indigo" and learning about the ReliableSessionEnabled binding property. What a disappointment to learn that for real reliability, one had to use the MsmqIntegration Binding and thus the proprietary MSMQ transport layer.

Monday, August 4, 2008

Amazon Web Services - Book review

While enjoying holidays, I read the book "Programming Amazon Web Services" by James Murty. As explained in my earlier post, I was most interested to learn how cloud computing could be leveraged for developing integration solutions.

The book discusses 5 Amazon Web Services (AWS):
  • Simple Storage Service (S3)
  • Elastic Compute Cloud (EC2), virtual Linux servers on demand
  • Simple Queue Service (SQS), to deliver short messages
  • Flexible Payment Service
  • SimpleDB - simple database with no SQL support
The book goes into quite some technical detail and has code snippets showing in detail how to interact with the Amazon services. All the samples are written in Ruby. I don't know Ruby, but the code is quite readable (should read Enterprise Integration with Ruby some day). The author prefers the REST and the Query API. Unfortunately, he does not show anywhere the use of the SOAP API to access Amazon WS.

The 1st chapter is introductory: it explains e.g. how to use self-signed certificates to connect with AWS, how AWS was developed for internal use by Amazon and later turned into products, and that the services come without an SLA (except for S3) and without real support.

In the 2nd chapter, the author builds up a library of Ruby code to access the Amazon Web Services. This is very well written and gives an immediate feeling for some aspects to take into account, e.g. clock differences.

S3 is covered in chapters 3 and 4. No standard file access but the use of buckets and objects through a non-standard API (REST or SOAP); no FTP, WebDAV or SFTP. And objects cannot be modified: only deleted and re-created (after the deletion has propagated). Ruby code is shown for all the options the API offers: bucket creation/lookup/deletion, object creation/listing/deletion, ACL update/retrieval and access logging file retrieval. Tricks with HTTP header fields (object metadata), posting data through forms, alternative hostnames and BitTorrent are discussed. The last part discusses signed URIs: this is a neat trick to make S3 resources temporarily accessible to users without an Amazon account.
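The signed URI trick boils down to an HMAC over a few request fields plus an expiry time, appended to the URL as query parameters. Below is a Python sketch (the book itself uses Ruby) of the legacy query-string scheme as I understand it; it is not taken from the book, the function name and the credentials in the usage line are made up, keys are assumed to be URL-safe, and newer AWS signature versions work differently.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_s3_url(access_key, secret_key, bucket, key, expires_epoch):
    """Build a time-limited GET URL using legacy query-string authentication."""
    resource = f"/{bucket}/{key}"
    # Fields: verb, Content-MD5 and Content-Type (both empty), expiry, resource.
    string_to_sign = f"GET\n\n\n{expires_epoch}\n{resource}"
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return (
        f"https://s3.amazonaws.com{resource}"
        f"?AWSAccessKeyId={quote(access_key, safe='')}"
        f"&Expires={expires_epoch}"
        f"&Signature={quote(signature, safe='')}"
    )


# Hypothetical credentials and object; the expiry is seconds since the epoch.
print(sign_s3_url("EXAMPLEKEY", "example-secret", "my-bucket", "report.pdf", 1230768000))
```

Anyone holding the URL can fetch the object until the expiry passes, without needing an Amazon account; the secret key itself never leaves the party that generated the URL.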

Chapter 4 shows some applications of the S3 service: large file transfer, backup, turning S3 into a file system (with FTP or WebDAV). Interesting to note that the author has his doubts wrt. exposing S3 as a file system. The author also discusses his own Java open source application: JetS3t. This application is a "gatekeeper" for S3 resources: it authorizes local agent applications, which after acquiring a signed URL can upload files to S3 and download files from S3.

Chapters 5, 6 and 7 dive into EC2 and how virtual Linux systems (based on Xen) can be configured using Amazon Machine Images. Ruby code is shown for every available API: keypairs (for SSH access), network security (dynamically configure the firewall), images and instances. Chapter 6 explains instances in more detail and discusses how to create new images. This involves quite some commands and scripts at the Linux command prompt. Chapter 7 discusses some sample applications: a VPN server and a web photo album that backs up its data on S3. Chapter 7 also discusses issues around dynamically assigned IP addresses and the use of dynamic DNS.

The Simple Queue Service (SQS) is discussed in chapters 8 and 9. Because of the small message size, SQS is clearly meant for events, with the actual data stored on S3 (or elsewhere). Again Ruby code to manipulate queues and messages. Chapter 9 describes a Messaging Simulator application, not that relevant in my opinion. The 2nd application - leveraging a video conversion tool - shows how to build a generic service for implementing "batch" services (Command Message pattern). The 3rd application - LifeGuard - leverages SQS to manage EC2 instance pools and dynamically scale the number of EC2 instances.
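The "small event on the queue, real data in storage" idea can be sketched without any Amazon API at all. In the Python sketch below, a dict stands in for S3 and a deque for SQS; the function names and the size limit are illustrative assumptions, not Amazon's interfaces or official figures.

```python
import json
import uuid
from collections import deque

MAX_EVENT_BYTES = 4 * 1024  # illustrative limit, not an official figure


def publish(store, queue, payload):
    """Store the payload and enqueue a small event pointing at it."""
    key = str(uuid.uuid4())
    store[key] = payload
    event = json.dumps({"key": key, "size": len(payload)})
    if len(event.encode("utf-8")) > MAX_EVENT_BYTES:
        del store[key]
        raise ValueError("event too large for the queue")
    queue.append(event)
    return key


def consume(store, queue):
    """Dequeue the next event and fetch the payload it refers to."""
    event = json.loads(queue.popleft())
    return store[event["key"]]


# A dict stands in for the object store, a deque for the queue.
store, queue = {}, deque()
publish(store, queue, b"x" * 100_000)
print(len(queue[0]), "bytes on the queue,", len(consume(store, queue)), "bytes of data")
```

The event stays tiny regardless of the payload size, which is what makes a queue with a small message limit usable for large documents.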

The chapter on payment service I skipped and I only skimmed through the SimpleDB chapter. Enough to learn that SimpleDB is not an RDBMS but a basic storage mechanism (no data types) with proprietary query facilities (no SQL).

The author writes fluently and gives a non-biased view on the Amazon Web Services. Sometimes the code goes into too much detail, showing how to invoke every available method of the API. Although the book is very recent (March 2008), important new features such as elastic IP addresses, persistent storage for EC2 and availability zones weren't yet available at the time of writing. The book definitely taught me that AWS is quite proprietary and not that trivial. And to use Amazon's cloud computing and AWS, you'd better "think like Amazon".

Friday, August 1, 2008

Amazon Web Services

When the Amazon Simple Queue Service appeared about 2 years ago, I looked into it as a solution for message exchange between business partners. With its limited message size (only 4K) and message retrieval based on simple Amazon user accounts, I put the offering aside. The Amazon offering did however trigger me into wondering when the big Internet players (Google, Microsoft, Salesforce or Amazon) would enter the integration market.

But the 'big' players aren't entering the integration world (yet?). Alternatively, software vendors could cloud-enable their software. Or anyone could leverage the cloud and develop an integration system on top of it.

Some scenarios that I can envision, probably lacking some imagination here ;-)
  • upload messages to the cloud from which they can be polled and retrieved (e.g. some central FTP or JMS server)
  • service composition (something like Splice)
  • business processes managed in the cloud through a BPEL process engine (like RunMyProcess)
  • service enabled integration solution (like Grand Central Communications once tried)
  • XML gateway/firewall that filters traffic, enforces policies and forwards requests to different back-ends
  • B2B hub (re-invention of the VAN)
  • SOA governance as a service (sharing services within a community)
  • Centralized SAML provider used by federation(s) of business partners
  • WS-Trust Security Token Service (STS)
To learn about cloud computing and see how it could be used for cloud based integration solutions, I went through the book "Programming Amazon Web Services" by James Murty.

Initial conclusion/impression after reading the book was that Amazon services still lack some important features. Some limitations of the Amazon services:
  • no fixed IP addresses, use of dynamic DNS required; with no URL pointing to the Amazon servers, own server in own data center required as main entry point
  • EC2 instances lose all state when they stop or die (partially addressed with backups at short intervals)
  • no SLA (except for S3), EC2 and SimpleDB are still "beta" (but so is gmail)
  • payment via credit card (no formal ordering/invoicing)
  • only forums to report and track problems, no formal communication channels
  • propagation latency of new or updated S3 objects (without even a guarantee that you retrieve the latest document yourself, because there is no guarantee that S3 requests will be directed to the same location)
  • no transactions, e.g. when retrieving messages from SQS service
  • no relational database
  • no guarantees, everything on best effort
But since the publication of the book, a number of shortcomings were already addressed.
  • Elastic IP addresses now offer fixed IP addresses and do away with dynamic DNS. A single EC2 instance can load balance requests to other instances.
  • Availability zones allow instances to be started in specific zones (read data center).
  • Persistent storage for EC2 now provides a real file system for EC2 surviving restarts (although file system can only be mounted by 1 instance).
  • And AWS Premium Support starts addressing the support issue.
So it seems that AWS can be used for some of my envisioned scenarios. Open source projects will definitely offer their integration solutions as a service on the Amazon cloud. I'm curious to see if and when the 1st closed source integration vendor takes the same step.

Monday, July 28, 2008

Book review: SOA Security

Recently, a new book on Web Services Security was published: SOA Security. While on holiday in beautiful Italy, I went through the book. Obviously the book starts with some introduction on SOAP, WSDL and the like. But the major part of the book explains in quite some detail how to implement secure web services using JAX-RPC and Axis.

Two parts were most interesting to me personally. First of all chapters 4 and 5 on Kerberos and "Secure Authentication with Kerberos": it was a joy to get a refresher on Kerberos, see some sample code using the JAAS and GSS APIs and see code snippets implementing the Kerberos Token Profile. And secondly Chapter 8 "Implementing security as a service": the different models in which web services can leverage SAML are well explained. The relationship with WS-Trust is also made.

Sometimes I got the impression that the base material was written some time ago before being published. E.g. it is a pity that the samples are based on JAX-RPC and Axis. Axis2 or JAX-WS would have been better choices. Still, the book is OK, and don't be misled by the sample chapters online which are quite basic. But I keep hoping that one day, a 2nd edition of "Securing Web Services with WS-Security: Demystifying WS-Security, WS-Policy, SAML, XML Signature, and XML Encryption" will appear.

Note: it is clear that the authors have been working with or for CISCO as they use the CISCO specific acronym AON instead of XML gateway or similar

Tuesday, July 22, 2008

Free SOA books online

A couple of interesting SOA books can be freely accessed via pdfchm, a cheap alternative to Safari.

Tuesday, July 15, 2008

CapeClear

While accessing the blog of Annrai O'Toole, CEO of CapeClear, I learned that CapeClear was acquired by Workday. This already happened back in February, but I didn't get that news.

Remember well when CapeClear was one of the 1st vendors with a commercial SOAP stack. They even shipped me their 1.0 trial version back in 2001! Their SOAP stack evolved into an ESB. I'm not aware of any (major) organization in the Benelux using CapeClear.

CapeClear was also one of the 1st (from version 6.5) to have a persistent WS-ReliableMessaging implementation. So not in memory like many other WS-RM implementations, but with state and messages safely stored persistently in a database.

Wednesday, July 9, 2008

Idempotent services

Definition of idempotent: "Refers to an operation that produces the same results no matter how many times it is performed." See also the "Idempotent receiver" pattern.

An idempotent service makes life much easier in the unreliable web services world. A service can be invoked multiple times without additional side effects. This is highly useful when a service invocation times out (was the service executed, yes or no?). But also to re-try execution of (part of) a business process.

Making a service idempotent consists of 2 parts: duplicate detection and caching of response messages. I discussed duplicate detection in an earlier posting. Returning the earlier response message is somewhat harder: each response message must be stored for a reasonable amount of time.
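The two parts can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name IdempotentService, the in-memory dictionary used as response store and the retention period are all hypothetical, and a real implementation would keep the store in a database inside the same transaction as the service logic.

```python
import time


class IdempotentService:
    """Wraps a handler with duplicate detection plus a response store (in memory)."""

    def __init__(self, handler, retention_seconds=3600.0, clock=time.monotonic):
        self._handler = handler              # the real, non-idempotent service logic
        self._retention = retention_seconds  # how long responses are kept
        self._clock = clock
        self._responses = {}                 # message id -> (stored_at, response)

    def invoke(self, message_id, request):
        self._purge_expired()
        stored = self._responses.get(message_id)
        if stored is not None:
            # Duplicate: do not execute again, return the earlier response.
            return stored[1]
        response = self._handler(request)
        self._responses[message_id] = (self._clock(), response)
        return response

    def _purge_expired(self):
        cutoff = self._clock() - self._retention
        expired = [key for key, (stored_at, _) in self._responses.items()
                   if stored_at < cutoff]
        for key in expired:
            del self._responses[key]


if __name__ == "__main__":
    executed = []

    def create_order(request):
        executed.append(request)
        return {"item": request, "order_number": len(executed)}

    service = IdempotentService(create_order)
    first = service.invoke("msg-1", "book")
    retry = service.invoke("msg-1", "book")  # e.g. the client retries after a timeout
    print(first == retry, len(executed))
```

The retry gets exactly the same response as the first call and the handler runs only once, which is what a caller needs after a timeout.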

Idempotency is not mentioned in the WS-* specs nor in the WS-I basic profile. Making web services idempotent is left as an exercise to the designer/developer.

With REST, a number of verbs such as GET and DELETE are idempotent. POST (create) is obviously not idempotent. PUT (update) should be idempotent. I can believe PUT to be idempotent in a document world. But PUT is an update and therefore a change in my opinion.

As already explained in my earlier post, the ideal location to do duplicate detection and make services idempotent is in the back-end services themselves. Interesting to learn about support for idempotency in recent SAP products (Netweaver 2004S SPS09 / ECC SE 600 SP03). SAP stores the response messages in a "response store".

Besides duplicate detection, let's ask ESB vendors to implement support for idempotent services through a "response store".

Tuesday, July 8, 2008

Duplicate detection

A basic mechanism to make a service idempotent is duplicate detection. For one-way services, duplicate detection may be sufficient. Request/response services also require that exactly the same response message is returned.

Duplicate detection isn't that hard:
  1. Unique value in the request. Multiple options exist:
    • business identifier such as purchase order number
    • hash of the input message or
    • technical identifier such as WS-Addressing MessageID
  2. Database table with unique constraint to manage the identifiers.
  3. Cleanup job to remove old entries from the above database table.
The ideal location to do duplicate detection is in the back-end services themselves. The service is invoked in a transactional context and already has access to a database. But it can also make sense to do duplicate detection in an intermediary such as an ESB. E.g. when an ESB switches between an unreliable protocol such as HTTP and a reliable persistent queuing protocol.
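The three steps listed above can be sketched with Python and SQLite. This is only a minimal illustration: the table name processed_messages, the function names and the retention period are assumptions, and in a real back-end the insert would run in the same transaction as the business logic.

```python
import hashlib
import sqlite3
import time


def open_store(path=":memory:"):
    """Step 2: a table whose primary key rejects identifiers seen before."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_messages ("
        " message_id TEXT PRIMARY KEY,"
        " received_at REAL NOT NULL)"
    )
    return conn


def message_key(body: bytes) -> str:
    """Step 1, option 'hash of the input message' (SHA-256 chosen for illustration)."""
    return hashlib.sha256(body).hexdigest()


def is_first_delivery(conn, message_id, now=None):
    """Return True the first time an identifier is seen, False for a duplicate."""
    now = time.time() if now is None else now
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed_messages (message_id, received_at) VALUES (?, ?)",
                (message_id, now),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # unique constraint violated: duplicate


def cleanup(conn, max_age_seconds, now=None):
    """Step 3: remove entries older than max_age_seconds; returns rows deleted."""
    now = time.time() if now is None else now
    with conn:
        cursor = conn.execute(
            "DELETE FROM processed_messages WHERE received_at < ?",
            (now - max_age_seconds,),
        )
    return cursor.rowcount


if __name__ == "__main__":
    store = open_store()
    print(is_first_delivery(store, "PO-4711"))  # first delivery
    print(is_first_delivery(store, "PO-4711"))  # duplicate
```

Letting the database constraint decide, instead of a select followed by an insert, avoids a race between two concurrent deliveries of the same message.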

Integration solutions often need to maintain information about messages anyway: for message monitoring, message archiving or non-repudiation (keeping a copy of signed messages). Duplicate detection can also be combined with protection against replay attacks (even required by the core WS-Security spec).

Mule comes with an Idempotent Receiver that provides duplicate detection.
Still, why other ESB's don't come with duplicate detection mechanisms is a mystery to me!

Monday, July 7, 2008

JSON

Many message formats have been defined for exchanging structured data: ASN.1, DCE RPC, ONC RPC, IIOP, DCOM IDL, ... Many alphabets have been defined, on which new languages (message definitions or RPC contracts) were built!

Although XML had its roots in the document centric world (as a successor for SGML), XML gained an immense popularity in the world of structured data. XML was considered a good standard for self-describing messages: no mixed content but tree-structured messages with data only at the leaves of the tree. First DTDs and next XML schemas provided mechanisms to describe the XML messages.

Pushed by HP's eSpeak and XML-RPC, the SOAP protocol triggered the creation of the WS-* stack. But with the rise of AJAX and other Web 2.0 technologies, new alphabets emerge and gain popularity. JSON is one of those new kids on the block.

Rich Internet clients are not limited to HTML but use all sorts of message structures on top of HTTP. With the use of Javascript in the browser, a Javascript friendly protocol eases life. That's what JSON is all about. Apart from JSON, many other protocols such as Adobe AMF pop up (performance comparison, thanks Stephan for the link).

Now, what is the impact for the integration world? Of course, JSON and others will be used for a few A2A and B2B scenarios (although I haven't seen any yet). But many tools will need to extend their XML support not only to REST, but to many other protocols. Organizations will keep one main entry point for all their HTTP traffic. Next to HTML and XML, JSON and others will be added. Therefore, infrastructure will be impacted first, e.g. XML gateways will need to add JSON support and change their name from XML gateway ;-)

Standard ways of using JSON are already popping up, e.g. JAX-RS (JSR-311) adopted Jettison for JSON support and includes a JSON-XML converter which follows the Badgerfish approach for JSON-XML conversion.

JSON-XML deserves our attention: a widely accepted XML representation for JSON (don't use the word "standard") will allow re-use of transformation logic. But I have some doubts around this Badgerfish JSON-XML conversion. It takes XML as a starting point, e.g. it puts XML namespaces inside JSON which doesn't look nice.
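To make the namespace remark concrete, here is a small Python sketch of the main Badgerfish rules as I understand them: text content goes into "$", attributes get an "@" prefix, repeated child elements become arrays and namespace declarations end up in an "@xmlns" object. It is a simplified illustration and not the Jettison implementation: the function names and the sample message are made up, and it does not copy namespace declarations down to child elements as the full convention does.

```python
import json
from xml.dom import minidom


def _convert(element):
    """Convert one DOM element to a Badgerfish-style dictionary (simplified)."""
    obj = {}
    for name, value in element.attributes.items():
        if name == "xmlns":
            obj.setdefault("@xmlns", {})["$"] = value        # default namespace
        elif name.startswith("xmlns:"):
            obj.setdefault("@xmlns", {})[name[6:]] = value   # prefixed namespace
        else:
            obj["@" + name] = value                          # ordinary attribute
    for child in element.childNodes:
        if child.nodeType == child.TEXT_NODE:
            text = child.data.strip()
            if text:
                obj["$"] = obj.get("$", "") + text
        elif child.nodeType == child.ELEMENT_NODE:
            converted = _convert(child)
            if child.tagName in obj:
                if not isinstance(obj[child.tagName], list):
                    obj[child.tagName] = [obj[child.tagName]]
                obj[child.tagName].append(converted)         # repeated element -> array
            else:
                obj[child.tagName] = converted
    return obj


def to_badgerfish(xml_text):
    root = minidom.parseString(xml_text).documentElement
    return {root.tagName: _convert(root)}


if __name__ == "__main__":
    sample = (
        '<order xmlns="http://example.org/po" xmlns:a="http://example.org/addr">'
        "<a:city>Antwerp</a:city>"
        "</order>"
    )
    print(json.dumps(to_badgerfish(sample), indent=2))
```

The output carries an "@xmlns" object and a prefixed "a:city" key, which is exactly the XML flavour leaking into the JSON that I find less elegant.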

Note: while attending SpringOne, I was caught by surprise when encountering the first use of JSON for a configuration file (for the new Spring container if I remember well)

Saturday, July 5, 2008

Sub-mappings like sub-routines

Mappings or transformations are one of the most important parts of integration flows. A lot of energy is spent on developing, maintaining and managing such transformation logic. In an XML world, transformations are often developed using XSLT (and some XQuery). But besides these nice standards, a lot of other more proprietary transformation solutions are used.

Just like with normal application logic, integration developers will always strive to re-use already existing interfaces and message formats. E.g. a new application or business partner will be pushed to re-use an already existing integration flow and corresponding message layout. Existing transformations may be extended to support specific requirements, thereby going through quite some testing.

But "re-use" of smaller pieces of transformation logic is not done. The concept of invoking lower level pieces of transformation logic does not apply. There are no libraries from which smaller pieces of transformation logic are re-used in multiple mappings.

And this is not even the case within a single mapping. E.g. for transforming a structure occurring multiple times in a source and target message (take for instance a shipping and invoice address), no subroutine (such as TransformAddress() ) will be used.
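What such a sub-mapping could look like is sketched below in Python. All field names and the source and target layouts are invented for the example. The point is only that one transform_address() routine serves both the shipping and the invoice address.

```python
def transform_address(source):
    """Re-usable sub-mapping: hypothetical source address -> hypothetical target address."""
    return {
        "street": f'{source["street_name"]} {source["house_number"]}',
        "postalCode": source["zip"],
        "city": source["town"],
        "country": source["country_code"].upper(),
    }


def transform_order(source):
    """Main mapping: delegates both address structures to the same sub-mapping."""
    return {
        "orderNumber": source["po_number"],
        "shipTo": transform_address(source["shipping_address"]),
        "billTo": transform_address(source["invoice_address"]),
    }


if __name__ == "__main__":
    order = {
        "po_number": "4711",
        "shipping_address": {"street_name": "Main Street", "house_number": "1",
                             "zip": "2000", "town": "Antwerp", "country_code": "be"},
        "invoice_address": {"street_name": "Side Street", "house_number": "22",
                            "zip": "1000", "town": "Brussels", "country_code": "be"},
    }
    print(transform_order(order))
```

In XSLT, a named template or a template with a mode would play the same role as transform_address() here.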

There are a couple of reasons for this way-of-working:
- messages are often not constructed from standardized building blocks
- source and target message must contain the same data structures multiple times
- re-usable pieces of transformation logic are often small
- managing the re-use of transformation logic across mappings requires discipline
- tools and standards are not helping much
Still, I find it strange that "modularization" is not applicable in the world of integration.

Friday, July 4, 2008

EAI is dead, long live EAI

Andreas Egloff from Sun Microsystems gave a talk at the JavaOne AfterGlow event on project Fuji, which is Sun's initial development for OpenESB 3.0. During his talk, he repeatedly stated that the latest communication channels such as RSS feeds are not that different from other connectivity options such as file transfer. I couldn't agree more with his "EAI is dead, long live EAI". Indeed, many of the basic principles apply. It is not about revolution but evolution.

Note: at the same event, Stijn Van den Enden gave a great talk on JSR-277 and OSGi, congratulations Stijn!

Thursday, July 3, 2008

Oracle Service Bus

From the podcast Oracle Middleware Strategy Update, I learned more about what Oracle will be doing with the BEA SOA products and how they will integrate or co-exist with the Oracle SOA Suite.

Interesting to hear that AquaLogic Service Bus (ALSB) will become the 2nd ESB of the Oracle SOA Suite, called the Oracle Service Bus. This product is positioned as a "standalone" ESB, extremely fast, light-weight and highly functional, but tied to the BEA WebLogic application server.

Oracle will drop the BEA SmartConnect adapters and retain its own JCA adapters, with exceptions being e.g. the REST adapter from BEA. Oracle also goes for its own Web Services Manager for policy based security (based on security solution obtained from Oblix). Oracle will drop the BEA security solution and stop the relationship with Amberpoint. But Oracle will retain the BEA Event Server (Complex Event Processing) and BEA Repository (Flashline acquisition) as well as AquaLogic BPM (Fuego acquisition).

Interesting to learn that Oracle retains the BPM solution to complement their (strong) BPEL offering (Collaxa acquisition). Many integration vendors have 2 process management solutions: one more low-level focusing on BPEL and integration processes and one more high-level focusing on process modelling, human interaction (workflow) and business rules.

Oracle has made more podcasts available on other related BEA products. Nice to hear such clear statements from Oracle regarding the future of some of the BEA products.

Ethernet over powerline

The house I live in has quite some steel and concrete. This causes the Wifi connection upstairs to drop often and there is also some latency. Therefore I started digging around for a solution. A good friend of mine suggested the use of Ethernet over powerline, so use the electricity circuits in your home for transporting bits and bytes at high speeds. The maximum speed is currently 200 Mbit, which results in real throughput of maximum 50 to 60 Mbit. There are 2 camps: Homeplug AV with Intellon chipset and UPA with DSS9010 chipset from DS2. My good friend strongly recommended the 2nd option.

I bought a set of 2 powerline adapters from D-Link for 150 EUR, just plugged them in and voilà. The adapters work fine with latency below 4 ms and very stable connectivity.

For the security aware: Homeplug AV uses AES encryption and my UPA solution only uses 3DES. I don't know if and from what distance neighbors could try to break into this network connection.

Friday, June 27, 2008

JMS Correlation Patterns

A customer came up with the suggestion to use the CorrelationID pattern for correlating JMS requests and responses. In this pattern, a client sets a (preferably unique) value in the JMSCorrelationID property of the request. The service copies request.JMSCorrelationID to reply.JMSCorrelationID. This was a way-of-working I hadn't encountered.

There are 2 main JMS correlation mechanisms that I had seen so far: 1) the MessageID pattern whereby the service copies the request.JMSMessageID to the response.JMSCorrelationID and 2) the use of temporary queues.
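The difference between the MessageID pattern and the CorrelationID pattern is easiest to see side by side. The Python sketch below is not JMS code: the Message class and the send() function are stand-ins I made up to mimic a provider that assigns the message id at send time, and the field names merely mirror JMSMessageID and JMSCorrelationID.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    body: str
    message_id: Optional[str] = None      # stands in for JMSMessageID
    correlation_id: Optional[str] = None  # stands in for JMSCorrelationID


def send(message: Message) -> Message:
    """Mimic a JMS provider: the message id is only assigned when sending."""
    message.message_id = "ID:" + uuid.uuid4().hex
    return message


def reply_message_id_pattern(request: Message) -> Message:
    """MessageID pattern: the service copies the request's message id."""
    return send(Message(body="reply to " + request.body,
                        correlation_id=request.message_id))


def reply_correlation_id_pattern(request: Message) -> Message:
    """CorrelationID pattern: the service copies the request's correlation id."""
    return send(Message(body="reply to " + request.body,
                        correlation_id=request.correlation_id))


if __name__ == "__main__":
    # MessageID pattern: the client must remember the id the provider generated.
    request = send(Message(body="get status"))
    reply = reply_message_id_pattern(request)
    print(reply.correlation_id == request.message_id)

    # CorrelationID pattern: the client picks its own key, e.g. from the payload.
    request = send(Message(body="get status", correlation_id="PO-4711"))
    reply = reply_correlation_id_pattern(request)
    print(reply.correlation_id)
```

With the MessageID pattern the client has to keep the generated id as extra state after sending, while with the CorrelationID pattern it can select the reply with a value it already knows.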

In the Enterprise Integration Patterns book, the use of the CorrelationID as a "Conversation Identifier" is confirmed. And indeed, the CorrelationID pattern has some nice advantages, e.g. the correlation can be based on some unique identifier the JMS client picks itself (e.g. from the payload), no need to keep an extra JMSMessageID as an extra state variable.

Unfortunately, the SOAP-over-JMS draft also prescribes copying the JMSMessageID of the request to the JMSCorrelationID of the response message. And this is e.g. what Spring-WS implements.

Thursday, June 26, 2008

Book: Web Service Contract Design and Versioning for SOA

Just finished reading the early access version of the book "Web Service Contract Design and Versioning for SOA". Recommended book, well written!

As its title suggests, the primary focus is contract design. Less attention is paid to versioning. But the book goes into extensive detail wrt. XML Schema design, WSDL creation and use of related specs such as WS-Addressing and WS-Policy. Also the differences between different versions of e.g. SOAP and WSDL are well addressed.

This book really fills a gap: SOA books often remain at a too high level, standard WS development books dive directly into code and XML Schema books are unrelated to WS-*.

Every serious WS/BPEL/SOA developer or designer needs to have a good understanding of these base technologies, in particular XML schema. One of the better WS books since "Web Services Platform Architecture SOAP, WSDL, WS-Policy, WS-Addressing, WS-BPEL, WS-Reliable Messaging, and More".

Note: after reading such a book, one must confess that WS-* is quite messy and more cleanup is needed; WS-I will need to write more or lengthier profiles!

Wednesday, June 25, 2008

HTTPS all the time?

Why don't web sites, web applications and web services use HTTPS by default? What prevents us from using HTTPS for all Internet communication? No more risk while accessing applications from public places such as hotels. No more risk of an ISP looking into your confidential network traffic.

Obviously, SSL takes some CPU power. I don't know how costly SSL is, but isn't this becoming negligible? On the other hand, there is SSL accelerator hardware being sold, so there must be some need for it.

Another challenge is certificate management. Services can use one of the well known Certificate Authorities. Alternatively, clients should become better at managing self-signed server certificates or unknown CA certificates. Many client apps, including WS clients, would benefit from user friendly certificate and key management. No more Java keytool, but a user friendly configuration GUI.

Monday, June 16, 2008

XML gateways

If you're interested in learning more about XML gateways, 2 vendors provide reasonably detailed information. The technical documentation of CISCO's products is available online, both for their AON and ACE (ex-Reactivity) products. And IBM provides 4 Redbooks that go into quite some detail on DataPower. But other vendors such as Cast Iron, Layer7, Vordel and Intel (ex-Sarvega) all seem to consider their docs a valuable asset not to be shared with the world.

Spring Integration is not an ESB

The Spring Framework is a popular application framework for Java Enterprise applications. Best known is its mechanism of "Inversion of control" or dependency injection. But the Spring Framework comes with many other features. Typically, Spring will define a clean set of interfaces and implementations on top of existing Java constructs.

Now, Spring is also popular in the world of integration: most open source ESB's - Mule, ServiceMix - are based on the Spring Framework. Recently Spring came up with its own "Spring Integration" sub-project. Although still in beta, this doesn't seem like a standard ESB. Rather, Spring Integration is more focused on integration "within" the application. Separating the integration logic from the business logic, but keeping it within the application itself. As such, one gets integration at the edges. In line with the Spring Framework itself, Spring Integration is obviously Java oriented, e.g. the Message object has a Java object as payload. It does not make any assumptions about payload format, XML or other. Spring Integration takes the Enterprise Integration Patterns book as a starting point, and that is something it does have in common with the many open source ESB's.

Another interesting sub-project is Spring Batch. This framework is focused on processing large data sets whereby the processing is split into multiple transactions and progress of batch jobs is maintained in database tables. This tool also comes with adapters and message conversion, but from a completely different angle.

Interesting to watch how the Spring team will grow these projects, along with Spring-WS and their upcoming REST implementation.

Monday, June 9, 2008

OSGi in Action

OSGi is becoming quite popular as the module system for Java Enterprise systems, including integration solutions. In open source integration solutions such as Mule, but also commercial ones such as Tibco ActiveMatrix.

The OSGi services look like a smaller scale SOA, within the Java VM. I can well imagine OSGi services being used to dynamically invoke different pieces of transformation or other integration logic from within an OSGi enabled ESB or mediation framework.

There were 2 sources of information that helped me get up-to-speed wrt OSGi: first of all the recorded talk on Parleys that Costin Leau gave about OSGi (and Spring) at the SpringOne conference. And secondly the draft chapters of the upcoming book "OSGi in Practice" by Neil Bartlett.

Wednesday, June 4, 2008

Creating XML schema's

When doing contract first Web Services design and development, the primary challenge is coming up with a good XML schema (XSD).

One of the options is to create a sample XML message and generate an XML schema from that sample message. The XML editors Stylus Studio and XML Spy support the generation of an XML schema from an XML message. Unfortunately, my favorite XML editor - Oxygen - doesn't come with this feature.
Note: other options are Trang and the xsd tool of the .Net framework

But apart from these 3 well known XML editors, I just came across a free XML editor that does support the generation of XML schemas: Liquid XML Studio. The Liquid XML editor also makes it easy to add facets (minOccurs, maxOccurs, maxLength, ...) to the schema.

The generated schema's usually don't look nice and each tool has its own way of generating the XML schema. To re-factor the generated XML schema, the XML schema editor of the Eclipse IDE comes in handy. One of the options I like in particular is the re-factoring of an anonymous Element into a Complex type.

Tuesday, May 27, 2008

Tibco ActiveMatrix

Unbelievable: Tibco makes their latest product - ActiveMatrix - available for download! And some of the documentation is directly accessible. It is so much easier for independent consultants like me if vendors make such crucial information readily available.

Although both products have different backgrounds and different views, I recognize the same "distributed architecture" as in Fiorano. In particular the distributed execution with central configuration and management (most integration products are hub-and-spoke). And a strong JMS implementation underneath, TibcoEMS and FioranoMQ respectively. Obviously, Tibco ActiveMatrix focuses much more on (web) service mediation, policy enforcement and is a pretty advanced product.

But a more important player and competitor is BEA's AquaLogic Service Bus: ALSB also focuses on proxying web services, policy enforcement, service monitoring and management. But one difference between ActiveMatrix and AquaLogic is the hosting of business services: ActiveMatrix is a container for business services (even in .Net), while AquaLogic envisions services to be deployed in their WebLogic application server or 3rd party containers.

Looking forward to diving deeper into ActiveMatrix. And it is always interesting to see the architectural directions an important and independent vendor such as Tibco takes.

Monday, May 26, 2008

Reverse invoke / Reverse server

While browsing the latest SAP PI 7.1 documentation, I came across the latest doc of the SAP WebDispatcher. The WebDispatcher is a sort of reverse proxy that is very useful to restrict access to certain URLs of a web application server. The latest version of the WebDispatcher also supports "reverse invoke" or "reverse server".

A "reverse server" consists of 2 cooperating servers with a firewall blocking inbound connections. The server on the "inside" makes a number of connections to the server on the "outside". When the server on the outside receives a request, it forwards this request via one of these connections to the server at the "inside". The nice thing of this whole setup is that the firewall can remain closed.
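The mechanism can be mimicked with plain TCP sockets. The Python sketch below is a toy under assumptions: both "servers" run on localhost in one process, there is a single tunnel connection, requests are newline terminated, and all names are invented. It is not how WebDispatcher or WebMethods implement the feature. It only shows that the inside server is the one that opens the connection, while requests flow in the opposite direction.

```python
import socket
import threading


def handle_request(payload: bytes) -> bytes:
    """Hypothetical back-end logic running on the inside server."""
    return b"handled:" + payload


def run_inside_server(outside_address) -> None:
    """Inside server: dials OUT, then serves requests arriving over that connection."""
    with socket.create_connection(outside_address) as tunnel:
        with tunnel.makefile("rb") as reader:
            for line in reader:  # one request per line; loop ends when the tunnel closes
                tunnel.sendall(handle_request(line.rstrip(b"\n")) + b"\n")


class OutsideServer:
    """Outside server: waits for the inside server, then forwards requests to it."""

    def __init__(self) -> None:
        self._listener = socket.create_server(("127.0.0.1", 0))
        self.address = self._listener.getsockname()
        self._tunnel = None
        self._reader = None

    def accept_inside(self) -> None:
        self._tunnel, _ = self._listener.accept()
        self._reader = self._tunnel.makefile("rb")

    def forward(self, payload: bytes) -> bytes:
        """Send a client request inwards over the connection the inside server opened."""
        self._tunnel.sendall(payload + b"\n")
        return self._reader.readline().rstrip(b"\n")

    def close(self) -> None:
        self._reader.close()
        self._tunnel.close()
        self._listener.close()


def demo() -> bytes:
    outside = OutsideServer()
    inside = threading.Thread(target=run_inside_server, args=(outside.address,))
    inside.start()
    outside.accept_inside()
    try:
        return outside.forward(b"ping")
    finally:
        outside.close()
        inside.join(timeout=5)


if __name__ == "__main__":
    print(demo())
```

A real setup would keep a pool of such connections open and re-establish them when they drop.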

"Reverse invoke" functionality is e.g. also available in WebMethods, where 2 integration servers cooperate. But is there any open source (Java) implementation that does something similar? Maybe a simple alternative is a small JMS server on the server at the "outside"?

Sunday, May 18, 2008

3 byte characters in 2 bytes?

Character encodings remain a challenge in many integration projects. Just had a customer asking: how can a 3 byte character (UTF-8) fit in a 2 byte (UTF-16) character?

Simple question I thought: (modern) programming languages and operating systems use 2 bytes to represent a single character. This gives room for 2^16 characters. Although not covering all characters, I was assuming that 2 bytes were sufficient.

Just learned that some characters in UTF-16 are encoded as 4 bytes (2 x 2 bytes). These are called surrogate pairs. The first 2 bytes of such a pair (the high surrogate) are in the range D800-DBFF, the last 2 bytes (the low surrogate) are in the range DC00-DFFF. As such, UTF-16 is designed to support a little over a million characters.
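A few lines of Python make the arithmetic visible. The sketch answers the customer's question (the euro sign takes 3 bytes in UTF-8 but only 2 in UTF-16), computes a surrogate pair by hand and does a round trip between the two encodings. The function names are mine, and the musical G clef U+1D11E is just a convenient example of a character outside the first 2^16 code points.

```python
def to_surrogate_pair(code_point: int):
    """Split a code point above U+FFFF into its UTF-16 high and low surrogate."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000     # 20 bits remain
    high = 0xD800 + (offset >> 10)    # top 10 bits -> D800-DBFF
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits -> DC00-DFFF
    return high, low


def encoded_lengths(character: str):
    """Return the byte length of one character in UTF-8 and in UTF-16."""
    return len(character.encode("utf-8")), len(character.encode("utf-16-be"))


def round_trip(text: str) -> str:
    """UTF-8 bytes -> text -> UTF-16 bytes -> text, as a file adapter might do."""
    utf8_bytes = text.encode("utf-8")
    utf16_bytes = utf8_bytes.decode("utf-8").encode("utf-16-be")
    return utf16_bytes.decode("utf-16-be")


if __name__ == "__main__":
    print(encoded_lengths("\u20ac"))                            # euro sign
    print(encoded_lengths("\U0001D11E"))                        # musical G clef
    print([hex(unit) for unit in to_surrogate_pair(0x1D11E)])
    print(round_trip("price \u20ac, clef \U0001D11E") == "price \u20ac, clef \U0001D11E")
```

The "little over a million" follows from the same numbers: 17 planes of 65,536 code points minus the 2,048 values reserved for surrogates gives 1,112,064.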

For exchange of data in application integration scenarios, UTF-8 is recommended:
- No byte order, no need for a byte order marker
- No zero byte (making life easy for all those C-programmers)
- ASCII represented unchanged
- Compact encoding of Western European characters

The Javadoc shows that the Character class is aware of surrogates (at least as of Java 1.5). And I assume that some systems already use 4 bytes internally to represent characters, just to avoid the complexity of these surrogates (2 x 2 byte characters in UTF-16).

So another item for my todo list: experiment a bit with conversion of text containing such surrogates from UTF-8 to UTF-16 and back. In particular in e.g. file adapters of integration solutions.

Saturday, May 17, 2008

The future of XML - great article

Ran into a great article by Elliotte Rusty Harold: "The future of XML". Good to be reminded that XML was meant for publishing. Its use for remote procedure calls and object serialization was indeed not envisioned at its design time. And very nice to get a view on the future evolution of XML.

To me personally, the lack of good XML building blocks to define messages remains an important challenge. With XML being the alphabet, now we need a language on top, or at least some consensus. Message standards such as GS1, RosettaNet or UBL don't seem to be strong enough.

Back in 2001, I gave the talk "Understanding SOAP" at XML DevCon in London. I remember being at the speakers table with Elliotte. As I was only an XML beginner, I didn't do much of the talking ;-)

Friday, May 16, 2008

Blog lift off

After more than 3 years at Apogado - company I co-founded - I decided to switch back and become an independent integration consultant and architect again. Please find below some older blog entries that I liked to keep.