Friday, August 29, 2008

More SOA books

I'm always on the lookout for good books in the area of Integration, SOA, BPM and Web Services. I recently skimmed through a couple. None of them are medal winners, but some parts are worth the read.

Through my ACM membership, I have access to a limited list of books on books24x7. One of them was "Enterprise Architecture and Integration - Methods, Implementation, and Technologies". Each chapter is written by a different group of authors, and the quality varies widely from chapter to chapter. The best part is the 1st chapter, written by Wing Lam and Venky Shankararaman, the editors themselves. It is really great! The authors give a great overview of EAI and its relationship with SOA/BPM. The following chapters by different authors are (in my opinion) of lower quality. The 4th chapter was interesting again as it discussed SAP Netweaver and SAP XI.
If you have access to books24x7, go check out that 1st chapter. But don't spend your money on the book itself.

Via pdfchm.com, I stumbled upon "SOA Approach to Integration - XML, Web Services, ESB and BPEL in real-world SOA projects" by Matjaz B. Juric, Ramesh Loganathan, Poornachandra Sarang and Frank Jennings. The combination of SOA and Integration in the title set my expectations high. The 1st and 2nd chapters are nice introductory material. But further down, the chapters don't go into much detail. E.g. the BPEL chapter is really about the BPEL XML syntax.

Another book available on pdfchm.com is "SOA and WS-BPEL" by Yuli Vasiliev. The title of the book should rather be "PHP and Web Services". The book is well written. Chapters 1 to 4 go into The 5th chapter goes into BPEL; the 6th and last chapter shows how to implement an example using ActiveBPEL. And although I don't know much about PHP, the book looks very interesting for PHP developers.

Tuesday, August 19, 2008

Queuing in the cloud?

A post on the eai-select newsgroup mentioned OnlineMQ as an alternative to Amazon SQS. As stated in earlier posts, I think there is room and there are opportunities for cloud based integration solutions. A hosted queuing solution is a first step.

Therefore, I looked around the OnlineMQ website a bit. I created a free ("Silver") account and played around with it. The service will officially launch in September. The paid subscription ("Gold") costs $60 per connection; no mention is made of a maximum number or volume of messages.

The maximum message size is 256 KB, which is reasonable. An interesting feature is the option to configure a number of fixed IP addresses that are allowed to reach the server. It is typical in B2B solutions to work with fixed IP addresses and to add extra security by only allowing access from the fixed IP addresses of your business partners. This obviously requires everyone to have fixed IP addresses.
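To make the fixed IP address idea concrete, here is a minimal sketch in Python of how a server could check a caller against a list of partner addresses. It is not how OnlineMQ implements the feature (I have no insight into that); the addresses are made-up documentation ranges and the function name is my own.

```python
import ipaddress

# Hypothetical partner addresses (reserved documentation ranges),
# not taken from any real OnlineMQ configuration.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.10/32"),     # a single partner host
    ipaddress.ip_network("198.51.100.0/28"),   # a small partner subnet
]


def is_allowed(client_ip: str) -> bool:
    """Return True when client_ip falls inside one of the allowed networks."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is rejected outright.
        return False
    return any(address in network for network in ALLOWED_NETWORKS)
```

A caller at 192.0.2.10 would pass the check, while 192.0.2.11 or any partner whose address changes would be refused, which is exactly why the approach depends on everyone having fixed addresses.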

OnlineMQ talks about JMS but doesn't seem to support the JMS API (yet). However, OnlineMQ does support REST, POX (Plain Old XML) and SOAP interfaces. For the SOAP interface, it uses the (non WS-I compliant) rpc/encoded style. And security is not based on WS-Security.

It would be interesting to know more about the company or individuals behind this initiative, all the more so as the company seems to be located nearby in the Netherlands. The online agreement refers to UK law being applicable, so that points to a British initiative. From a reverse DNS lookup, I learn that the IP address is owned by Level3. So the servers are located at a Level3 facility.

Friday, August 8, 2008

AMQP enthusiasm?

All existing messaging solutions (WebSphereMQ, JMS, ...) use proprietary protocols. This is not a problem within a single organization. But between organizations, standard protocols are needed. Therefore, the B2B world uses protocols such as AS2, RNIF (RosettaNet) or good old (S)FTP(S).

AMQP is an initiative to bring a standard binary wire protocol to the messaging world. Just like POP3+SMTP allows you to retrieve and send emails using whatever email server, AMQP will allow any AMQP client to receive and send messages via any AMQP compliant server.

But when I read the spec, AMQP is focusing on the client-server protocol, contrary to SMTP, which is (also) used for communication between mail servers. The AMQP spec states that a bridge should be used for server-to-server communication, but doesn't provide any details. As such, AMQP is focusing on messaging within the corporate firewall.

AMQP can be used for unbalanced B2B scenarios, where one side runs the AMQP broker. This is a setup similar to one big company or intermediary running an (S)FTP(S) server and smaller organizations putting and retrieving files from it. But for good decoupling, server-to-server communication is preferred. The server at the sending side will take care of delivering the message to the server at the opposite side. This is what e.g. AS2 does: once an organization has an AS2 server in place, it becomes equal to all its AS2 counterparts.

With all this in mind, I was a bit puzzled by Paul Fremantle's enthusiasm about AMQP. In particular because he is the WS-RM spec lead.

WS-ReliableMessaging should have brought reliable async messaging to the WS-world. But it didn't. The WS-RM spec doesn't mention message persistence and so (most) vendors have an in-memory implementation, which is not reliable.
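Why persistence matters can be shown with a small sketch: a message is only considered accepted once it has been committed to disk, and it is only removed after the receiver has acknowledged it. This is my own illustration in Python using SQLite, not something defined by the WS-RM spec or taken from any vendor implementation; the class and method names are hypothetical.

```python
import sqlite3


class DurableOutbox:
    """Keep outgoing messages on disk until the receiver acknowledges them."""

    def __init__(self, path: str):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self._db.commit()

    def enqueue(self, body: str) -> int:
        # Commit before returning: the sender may only treat the message
        # as accepted once it is safely stored.
        cursor = self._db.execute("INSERT INTO outbox (body) VALUES (?)", (body,))
        self._db.commit()
        return cursor.lastrowid

    def pending(self) -> list:
        # Messages not yet acknowledged, oldest first; these are the
        # candidates for sending or re-sending after a restart.
        return self._db.execute("SELECT id, body FROM outbox ORDER BY id").fetchall()

    def acknowledge(self, message_id: int) -> None:
        # Remove a message only after the receiver confirmed delivery.
        self._db.execute("DELETE FROM outbox WHERE id = ?", (message_id,))
        self._db.commit()

    def close(self) -> None:
        self._db.close()
```

With an in-memory list instead of the database file, everything returned by pending() would be gone after a crash, which is the weakness of the in-memory implementations mentioned above.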

I still remember going through the book "Programming Indigo" and learning about the ReliableSessionEnabled binding property. What a disappointment to learn that for real reliability, one had to use the MsmqIntegration Binding and thus the proprietary MSMQ transport layer.

Monday, August 4, 2008

Amazon Web Services - Book review

While enjoying holidays, I read the book "Programming Amazon Web Services" by James Murty. As explained in my earlier post, I was most interested to learn how cloud computing could be leveraged for developing integration solutions.

The book discusses 5 Amazon Web Services (AWS):
  • Simple Storage Service (S3)
  • Elastic Compute Cloud (EC2), virtual Linux servers on demand
  • Simple Queue Service (SQS), to deliver short messages
  • Flexible Payments Service (FPS)
  • SimpleDB - simple database with no SQL support
The book goes into quite some technical detail and has code snippets showing in detail how to interact with the Amazon services. All the samples are written in Ruby. I don't know Ruby, but the code is quite readable (should read Enterprise Integration with Ruby some day). The author prefers the REST and the Query API. Unfortunately, he does not show anywhere the use of the SOAP API to access Amazon WS.

The 1st chapter is introductory. It explains, for example, how to use self-signed certificates to connect with AWS, and how the services were developed for internal use by Amazon and later turned into products. It also points out that they come without an SLA (except for S3) and without real support.

In the 2nd chapter, the author builds up a library of Ruby code to access the Amazon Web Services. This is very well written and gives an immediate feeling for some aspects to take into account, e.g. clock differences.

S3 is covered in chapters 3 and 4. There is no standard file access, only buckets and objects accessed through a non-standard API (REST or SOAP); no FTP, WebDAV or SFTP. And objects cannot be modified: only deleted and re-created (after the deletion has propagated). Ruby code is shown for all the options the API offers: bucket creation/lookup/deletion, object creation/listing/deletion, ACL update/retrieval and access logging file retrieval. Tricks with HTTP header fields (object metadata), posting data through forms, alternative hostnames and BitTorrent are discussed. The last part discusses signed URIs: this is a neat trick to make S3 resources temporarily accessible to users without an Amazon account.
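The signed URI trick can be sketched in a few lines. The book does this in Ruby; the version below is my own Python rendering of the query string authentication scheme as S3 offered it at the time (an HMAC-SHA1 signature over the verb, expiry time and resource). The function name, bucket, key and credentials are hypothetical, and this legacy scheme should not be mistaken for a complete or current client.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def presign_get_url(bucket, key, access_key_id, secret_key, expires_at):
    """Build a time-limited GET URL using the legacy query string scheme.

    expires_at is the expiry time in seconds since the Unix epoch.
    """
    # String to sign: verb, Content-MD5, Content-Type, expiry, resource.
    # Content-MD5 and Content-Type are left empty for a plain GET.
    resource = "/" + bucket + "/" + quote(key)
    string_to_sign = "GET\n\n\n" + str(expires_at) + "\n" + resource

    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")

    # The signature is base64, so it must be URL-encoded in the query string.
    return (
        "https://" + bucket + ".s3.amazonaws.com/" + quote(key)
        + "?AWSAccessKeyId=" + quote(access_key_id, safe="")
        + "&Expires=" + str(expires_at)
        + "&Signature=" + quote(signature, safe="")
    )
```

Anyone holding the resulting URL can fetch the object until the expiry time passes, without needing credentials of their own; the secret key itself never appears in the URL.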

Chapter 4 shows some applications of the S3 service: large file transfer, backup, turning S3 into a file system (with FTP or WebDAV). It is interesting to note that the author has his doubts about exposing S3 as a file system. The author also discusses his own open source Java application: JetS3t. This application is a "gatekeeper" for S3 resources: it authorizes local agent applications, which acquire signed URLs from it to upload files to S3 and download files from S3.

Chapters 5, 6 and 7 dive into EC2 and how virtual Linux systems (based on Xen) can be configured using Amazon Machine Images. Ruby code is shown for every available API: keypairs (for SSH access), network security (dynamically configure the firewall), images and instances. Chapter 6 explains instances in more detail and discusses how to create new images. This involves quite some commands and scripts at the Linux command prompt. Chapter 7 discusses some sample applications: a VPN server and a web photo album that backs up its data on S3. Chapter 7 also discusses issues around dynamically assigned IP addresses and the use of dynamic DNS.

The Simple Queue Service (SQS) is discussed in chapters 8 and 9. Because of the small message size, SQS is clearly meant for events, with the actual data stored on S3 (or elsewhere). Again there is Ruby code to manipulate queues and messages. Chapter 9 describes a Messaging Simulator application, not that relevant in my opinion. The 2nd application - leveraging a video conversion tool - shows how to build a generic service for implementing "batch" services (Command Message pattern). The 3rd application - LifeGuard - leverages SQS to manage EC2 instance pools and dynamically scale the number of EC2 instances.
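The split between small events on the queue and the actual data in storage can be sketched as follows. This is my own Python illustration with in-memory stand-ins for SQS and S3 (a queue and a dictionary), not code from the book and not the real AWS API; all names are hypothetical.

```python
import json
import queue
import uuid

object_store = {}            # stand-in for S3: object key -> payload bytes
event_queue = queue.Queue()  # stand-in for SQS: carries small JSON events only


def publish(payload: bytes, kind: str) -> str:
    """Store the payload first, then enqueue a small event pointing to it."""
    object_key = kind + "/" + uuid.uuid4().hex
    object_store[object_key] = payload
    event_queue.put(json.dumps({"kind": kind, "object_key": object_key}))
    return object_key


def consume() -> bytes:
    """Take the next event and fetch the payload it refers to."""
    event = json.loads(event_queue.get_nowait())
    return object_store[event["object_key"]]
```

The event stays tiny no matter how large the payload is, so the queue's message size limit only constrains the reference, not the data itself.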

The chapter on payment service I skipped and I only skimmed through the SimpleDB chapter. Enough to learn that SimpleDB is not an RDBMS but a basic storage mechanism (no data types) with proprietary query facilities (no SQL).

The author writes fluently and gives a non-biased view on the Amazon Web Services. Sometimes the code goes into too much detail, showing how to invoke every available method of the API. Although the book is very recent (March 2008), important new features such as elastic IP addresses, persistent storage for EC2 and availability zones weren't yet available at the time of writing. The book definitely taught me that AWS is quite proprietary and not that trivial. And to use Amazon's cloud computing and AWS, you'd better "think like Amazon".

Friday, August 1, 2008

Amazon Web Services

When the Amazon Simple Queue Service appeared about 2 years ago, I looked into it as a solution for message exchange between business partners. With its limited message size (only 4K) and message retrieval based on simple Amazon user accounts, I put the offering aside. The Amazon offering did however trigger me into wondering when the big Internet players (Google, Microsoft, Salesforce or Amazon) would enter the integration market.

But the 'big' players aren't entering the integration world (yet?). Alternatively, software vendors could cloud-enable their software. Or anyone could leverage the cloud and develop an integration system on top of it.

Some scenarios that I can envision, probably lacking some imagination here ;-)
  • upload messages to the cloud from which they can be polled and retrieved (e.g. some central FTP or JMS server)
  • service composition (something like Splice)
  • business processes managed in the cloud through BPEL process engine (like RunMyProcess)
  • service enabled integration solution (like Grand Central Communications once tried)
  • XML gateway/firewall that filters traffic, enforces policies and forwards requests to different back-ends
  • B2B hub (re-invention of the VAN)
  • SOA governance as a service (sharing services within a community)
  • Centralized SAML provider used by federation(s) of business partners
  • WS-Trust Security Token Service (STS)
To learn about cloud computing and see how it could be used for cloud based integration solutions, I went through the book "Programming Amazon Web Services" by James Murty.

Initial conclusion/impression after reading the book was that Amazon services still lack some important features. Some limitations of the Amazon services:
  • no fixed IP addresses, use of dynamic DNS required; with no URL pointing to the Amazon servers, own server in own data center required as main entry point
  • EC2 instances lose all state when they stop or die (partially addressed with backups at short intervals)
  • no SLA (except for S3), EC2 and SimpleDB are still "beta" (but so is gmail)
  • payment via credit card (no formal ordering/invoicing)
  • only forums to report and track problems, no formal communication channels
  • propagation latency of new or updated S3 objects (without even a guarantee that you retrieve the latest document yourself, because there is no guarantee that S3 requests will be directed to the same location)
  • no transactions, e.g. when retrieving messages from SQS service
  • no relational database
  • no guarantees, everything on best effort
But since the publication of the book, a number of shortcomings were already addressed.
  • Elastic IP addresses now offer fixed IP addresses and do away with dynamic DNS. A single EC2 instance can load balance requests to other instances.
  • Availability zones allow instances to be started in specific zones (read data center).
  • Persistent storage for EC2 now provides a real file system for EC2 surviving restarts (although file system can only be mounted by 1 instance).
  • And AWS Premium Support starts addressing the support issue.
So it seems that AWS can be used for some of my envisioned scenarios. Open source projects will definitely offer their integration solutions As A Service on the Amazon cloud. I'm curious to see if and when the 1st closed source integration vendor takes the same step.