Tuesday, October 30, 2012

Web Service incompatibilities

SOAP Web Services have lost quite a bit of their popularity: too complex, incompatibilities, etc.  My answer is always that 1) SOAP just adds a very simple envelope around the request and response messages and 2) SOAP works fine when you stick to the rules (a copy of a slide I use in my training classes):

Just recently I encountered 2 nice examples of SOAP incompatibilities.

Cookies and SOAP

While investigating the web services API of a cloud SaaS application, I encountered another example of how things should not be done.  First of all, it was not "stateless" but required the use of a login and logout operation, with security based not on standard HTTP basic authentication or WS-Security but on a proprietary scheme:

  <urn:credential>
    <urn:companyId>company-id</urn:companyId>
    <urn:username>user-name</urn:username>
    <urn:password>password</urn:password>
  </urn:credential>


But then came the surprise: the login operation returns a session handle which is actually a cookie!  The cookie has to be passed as an HTTP header in each subsequent web service call.  I had seen many ways to make web service implementations incompatible, but this is one for the top 5!  Obviously, most web service clients require some hack to pass this cookie along with the SOAP request.
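What such a client-side hack can look like, as a minimal Python sketch: it assumes the session cookie only has to be echoed back in a `Cookie` header. The endpoint URL, the cookie name `JSESSIONID`, the `getAccount` operation and the SOAP 1.1 namespace are all made up for illustration and are not taken from the actual API.

```python
import urllib.request

# Hypothetical values: endpoint, namespace choice and operation are illustrative only.
ENDPOINT = "https://example.invalid/services/api"
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(session_cookie: str, body_xml: str) -> urllib.request.Request:
    """Wrap body_xml in a SOAP envelope and attach the session cookie by hand."""
    envelope = (
        f'<soap:Envelope xmlns:soap="{SOAP_NS}">'
        f"<soap:Body>{body_xml}</soap:Body>"
        "</soap:Envelope>"
    )
    return urllib.request.Request(
        ENDPOINT,
        data=envelope.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # The proprietary part: session state travels outside the SOAP message.
            "Cookie": session_cookie,
        },
        method="POST",
    )


# The request is only built here, not sent.
request = build_soap_request("JSESSIONID=abc123", "<getAccount/>")
```

The point of the sketch is that the session handle never appears in the SOAP message itself, so a client generated from the WSDL alone has no way of knowing about it.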

Doc/literal with 2 parts

A more subtle challenge came up recently at a customer: the IBM DataPower ESB refused to import the WSDL file of an Oracle product.  The web service used the document/literal style and one of the operations had a request message consisting of 2 parts.  So who was wrong and who was right: IBM or Oracle?

SOAP went through some growing pains in the beginning. The initial idea was an RPC mechanism whereby an operation could have multiple parameters. These parameters are passed as multiple parts in a request and response message. But with a better understanding of XML and XML schemas, the world moved to a model whereby XML documents were passed. Microsoft introduced the document/literal wrapped style whereby the root element carries the name of the operation.
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Body>
    <OperationName>
      actual XML document...
    </OperationName>
  </soap:Body>
</soap:Envelope>

So my initial response was that document/literal web services should only have one part and that Oracle was wrong. But a colleague pointed out that Oracle would not implement web services that violate the standards. And indeed, the IBM article clearly explains that a document/literal web service can have multiple parts in a message.

The WS-I Basic Profile was an initiative to sharpen the rules and states: "R2201 A document-literal binding in a DESCRIPTION MUST, in each of its soapbind:body element(s), have at most one part listed in the parts attribute, if the parts attribute is specified.". So the Oracle web service is not WS-I Basic Profile compliant but does not violate the SOAP/WSDL specifications.
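A quick way to spot candidate messages before importing a WSDL into a tool, as a rough Python sketch. It is a simplification: it only counts the parts per message, whereas R2201 itself is about the parts bound to soapbind:body (a second part may legitimately be bound to a SOAP header), so a hit is a hint to look closer, not proof of a violation. The WSDL fragment is invented and is not the Oracle WSDL.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace


def messages_with_multiple_parts(wsdl_text: str) -> dict:
    """Return {message name: part count} for every message with more than one part."""
    root = ET.fromstring(wsdl_text.strip())
    offenders = {}
    for message in root.findall(f"{{{WSDL_NS}}}message"):
        parts = message.findall(f"{{{WSDL_NS}}}part")
        if len(parts) > 1:
            offenders[message.get("name")] = len(parts)
    return offenders


# Made-up WSDL fragment for illustration only.
SAMPLE = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" xmlns:tns="urn:example">
  <message name="CreateOrderRequest">
    <part name="header" element="tns:Header"/>
    <part name="order" element="tns:Order"/>
  </message>
  <message name="CreateOrderResponse">
    <part name="result" element="tns:Result"/>
  </message>
</definitions>
"""

print(messages_with_multiple_parts(SAMPLE))
```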

Again a situation where one has to go for a workaround, this time in the DataPower ESB. Had IBM implemented the specs correctly and/or Oracle stuck to the widely accepted ways-of-working and the WS-I Basic Profile, everything would have worked smoothly.

Sunday, October 28, 2012

IBM DataPower as ESX appliance


i8c does quite a lot of work with IBM WebSphere DataPower and counts a number of experienced and certified DataPower developers and architects. DataPower is great at securing and mediating web services.



Colleague Kim came up with a very interesting evolution: DataPower as a virtual appliance, so the XG45 gateway and XI52 integration appliance as ESX images.


This is similar to the CastIron offering of IBM that comes with on-premise hardware and virtual appliances. Right now we have a CastIron instance running on our ESX server (i8c employees can deploy their images either on an ESX server or a Windows server running HyperV).

Note: together with Joris Verberckmoes (Lead Designer SOA Services @ Gdf-Suez) I'll be giving a presentation at the Belgian WebSphere User Group on Nov 27; topic is the use of DataPower @ Gdf-Suez

IAAS players: open-source vs. commercial

While spending time in traffic jams on the E19 on my way to customers in Brussels, I listened to an interesting episode of the CloudComputingPodcast by David Linthicum: James Staten of Forrester gave his view on the different IAAS cloud offerings. This interview was triggered by an article written by James Staten. Interesting to hear James evaluate and categorize all the major players.  Below are some notes I made while listening to the podcast.
  • Open source
    • Eucalyptus
      • Clean room implementation of EC2
      • Was very popular
      • Eucalyptus moved focus from community to building up company and lost focus
    • Openstack (http://www.openstack.org/)
      • Joint project of NASA & Rackspace, compatible with the Amazon APIs
      • Just arrived in time to take over the momentum of Eucalyptus
      • Maturing technology, important chunks of code still have to be added by vendors
      • Expected to become very solid
      • Not generating a lot of revenue but generating a lot of attention
      • Vendors need to wait for Openstack to become more mature; how long will vendors have patience for Openstack to generate revenue and profit?
      • Many commercial vendors contributing code to it: Rackspace, IBM, VMWare, Redhat, Cisco, Dell, HP, ... (10 million dollar)
      • Adoption of Openstack is still low, Rightscale just recently moved to Openstack for its own cloud offering
      • The "Linux of the cloud world": kernel will become strong with all contributions coming back
      • Participants want to make large revenues with it or at least weaken their competitors; in particular HP and IBM have long-standing reputations for contributing back to the community
      • Smaller participants will position themselves in specific niche markets
      • Openstack is an open source project, not an open source standard
    • Cloudstack
      • Cloudstack is more mature
      • Cloudstack was acquired by Citrix
      • Only one big distributor: Citrix
      • Citrix donated Cloudstack to the Apache community
      • Ready to generate revenue now
      • Lacks the "momentum" of Openstack
  • Commercial leaders
    • Amazon AWS 
      • Supported by large firms like Accenture & Deloitte
      • Also still maturing, but further down the road
    • VMWare vCloud Director
      • Managed services provided by Deloitte, Accenture and others

What to bet on for now? Amazon on the public side and VMWare on the private side. Openstack may mature - similar to Linux - in 2 or 3 years. Amazon is a strong player but not yet a dominator of the public cloud market.  But in the private cloud world, most customers are still doing static virtualization, "still a lot of ground to be taken".

This podcast was really good, recommended!

Thursday, October 25, 2012

Message formats: death of XML?

SOAP web services are becoming old-fashioned.  The REST approach is really taking off.  A lot of things are moving in the world of protocols and data serialization:
Avro, Google

And a long list of other message formats. This evolution brings a "been there, done that" feeling.  I remember very well the CORBA and (D)COM wars, with their respective binary protocols, and the use of IDL (Interface Description Language) to describe message formats/structures.

XML has its strengths (Internationalization, human readable, schema language) and its weaknesses (verbose, complex, XML namespaces). But XML is - eh was - a well accepted message format, supported by all sorts of tools, in particular ESB's.

In the REST vs. SOAP debate, I often get the argument that SOAP is complex.  When sticking to the basics, SOAP is a very basic envelope around an XML data structure. 
<soap:Envelope>
  <soap:Body>
    <GetPhoneNumberInfo>
      <PhoneNumber>0479273658</PhoneNumber>
    </GetPhoneNumberInfo>
  </soap:Body>
</soap:Envelope>
Yes, it is document/literal wrapped style, but complex?
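To show how little the envelope adds, here is a small Python sketch that wraps and unwraps the payload above using only the standard library. The SOAP 1.1 namespace and the helper names `wrap_in_soap` and `unwrap_soap` are assumptions of this sketch; headers, faults and a namespace on the payload itself are left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1, assumed here
ET.register_namespace("soap", SOAP_NS)


def wrap_in_soap(payload: ET.Element) -> bytes:
    """Put the payload element inside Envelope/Body and serialize it."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return ET.tostring(envelope, encoding="utf-8")


def unwrap_soap(message: bytes) -> ET.Element:
    """Return the first child of the Body, i.e. the actual XML document."""
    body = ET.fromstring(message).find(f"{{{SOAP_NS}}}Body")
    return body[0]


request = ET.Element("GetPhoneNumberInfo")
ET.SubElement(request, "PhoneNumber").text = "0479273658"

message = wrap_in_soap(request)
payload = unwrap_soap(message)
print(message.decode("utf-8"))
```

Two elements around the document on the way out, one lookup on the way in: that is all the "basic" part of SOAP amounts to.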

Had Javascript (soap.js?) been there from the start to ease the life of Web developers, things might have looked different.  The use of the (ugly) DOM model to represent and manipulate HTML page structures is accepted.  But the use of XML for message payloads is not acceptable.  With JSON being the big winner.

OK, XML may be getting outdated.  But please, let's come up with a widely accepted, well standardized alternative.  And let's stick to the human readable alternatives, life of all the IT support people is already challenging enough.  And let's make the security guys happy - and ourselves - with well defined schema language(s) !  Maybe Schematron?

Note: Remember well how my colleague Luc Gevaert came up with a human readable, compact message format for service oriented solutions.  We implemented the "Generieke Middleware Laag" at Interpolis in 1998/1999 and the "Generieke Service Laag" at Rabobank in 2000/2001.

Note: A related topic on my to-do list is translation between JSON and XML, not a trivial subject.
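A small Python sketch of why the translation is not trivial. It uses a deliberately naive mapping (my own, not a standard one): leaf elements become strings, repeated child names become a list. The second phone number is invented for the example.

```python
import json
import xml.etree.ElementTree as ET


def naive_xml_to_json(element: ET.Element):
    """Naive mapping: leaf -> text, repeated child names -> list, otherwise dict."""
    children = list(element)
    if not children:
        return element.text
    result = {}
    for child in children:
        value = naive_xml_to_json(child)
        if child.tag in result:
            # Second occurrence of a name: promote the earlier value to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


one = ET.fromstring("<Phones><Phone>0479273658</Phone></Phones>")
two = ET.fromstring(
    "<Phones><Phone>0479273658</Phone><Phone>0479000000</Phone></Phones>"
)

print(json.dumps(naive_xml_to_json(one)))
print(json.dumps(naive_xml_to_json(two)))
```

With one phone number the JSON value is a plain string, with two it is an array, so the shape of the JSON depends on the data rather than on the schema. Attributes, namespaces, mixed content and typing (a phone number with a leading zero must stay a string) raise the same kind of question.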

Tuesday, October 23, 2012

ZeroMQ - 0MQ

Had heard the name "ZeroMQ" a couple of times, but now I dove a bit deeper into it.  First learning point: it is ØMQ rather than zeromq.  Second finding: ØMQ is not Message Oriented Middleware like WebSphereMQ, all the JMS implementations or MSMQ.  ØMQ is rather a library for building all sorts of distributed applications.  Contrary to MOM, it is fully distributed and does not rely on a central broker.




Didn't experiment with ØMQ, but browsed the ØMQ guide.  The guide goes into depth quite rapidly.  Must confess that I did not fully succeed in grasping ØMQ.  I found the zeromq API - the manpages - to be more comprehensible.  There's also a complete book in the works at O'Reilly.

ØMQ is written in C++ and the examples in the docs are C oriented.  Fine with me, brought me back 10 or 12 years: remember well implementing a "Generic Service Layer" with Netweave, in C at Rabobank.  The Netweave NWDS API was largely based on callbacks and therefore also asynchronous.

Another no-broker messaging solution I've worked with is Tibco RendezVous.  RV is also a high speed messaging solution.  Differences between ØMQ and Tibco: RV focuses exclusively on pub/sub and supports persistent messages by storing messages in transit in its ledger file.

ØMQ is largely driven by Pieter Hintjens of iMatix; iMatix acquired ØMQ and in particular the brand from ØMQ's developer Martin Sústrik (company FastMQ).  iMatix and Pieter Hintjens developed AMQP for JPMorgan but turned away from it in favor of ØMQ.  There's an interview with Pieter Hintjens available on FLOSS Weekly.  And as his name already suggests, Pieter Hintjens is indeed from Belgium.


At the beginning of 2012, there was a split in the ØMQ world: Martin Sústrik and Martin Lucina created their own company again - Crossroads I/O - and created a fork of zeromq.  We'll need to watch how things evolve in zeromq land.


Note: while learning about ØMQ, I also saw a lot of criticism of AMQP.  Another topic to explore in the future: how is the AMQP protocol doing?  And what are its strengths and weaknesses?  Is it really too complex?

Monday, October 22, 2012

SHA-3... from Belgium

While enjoying listening to another episode of SecurityNow, the "explainer-in-chief" touched upon the topic of the new hashing algorithm SHA-3.  Approximately 10 years ago, the US standardization organization NIST selected a new symmetric encryption algorithm, AES or Rijndael, to replace the old 3DES algorithm.  Rijndael was invented by 2 Belgian scientists from the University of Leuven, Joan Daemen and Vincent Rijmen.

NIST had launched another competition, this time to select a new hashing algorithm. SHA-2 is not broken, but it seems that NIST wanted another hashing algorithm based on completely different technology. If either SHA-2 or SHA-3 were broken, the other would not be impacted.

Nice to note that 3 Belgians and an Italian came up with SHA-3, congratulations! SHA-3 or "Keccak" is based on the sponge construction (very good reading material before going to bed...).
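For those who want to see both families side by side, a small Python sketch using the standard `hashlib` module (Python 3.6 or later). One caveat: `hashlib` implements SHA-3 as later standardized in FIPS 202, which uses a slightly different padding than the original Keccak submission, so its digests do not match those of the competition version. The input message is just an example.

```python
import hashlib

message = b"SHA-3... from Belgium"

# Same input, two unrelated designs: Merkle-Damgard (SHA-2) vs. sponge (SHA-3).
sha2_digest = hashlib.sha256(message).hexdigest()
sha3_digest = hashlib.sha3_256(message).hexdigest()

print("SHA-256 :", sha2_digest)
print("SHA3-256:", sha3_digest)
```

Both digests are 256 bits long, so from the caller's point of view one can replace the other, which is exactly what makes a second, independent algorithm useful as a fallback.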

Sunday, October 21, 2012

Zookeeper for ESB

From colleague Marc Kimpe, I received a link to an introductory Hadoop-as-a-Service article on InfoQ.  One of the things that struck me was the use of all the other Apache projects that are combined with Hadoop, in particular Zookeeper.

Zookeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.  Yahoo uses Zookeeper to maintain configuration data for its Yahoo! Message Broker (had never heard of it, can't find detailed info on it).

Would any ESB be using Zookeeper?  Yes, AdroitLogic UltraESB and Talend ESB are.  UltraESB for its configuration data.  Talend ESB uses Zookeeper primarily as a Service Locator.

And these 2 ESB's brought me (back) to esbperformance.org, where the performance of open source ESB's is benchmarked.  Unclear how neutral esbperformance.org is, but interesting to see how some ESB's simply fail and how the driver behind the latest execution round - UltraESB - also is the "winner".  From the AdroitLogic website I learned that the company was founded by a number of ex-WSO2 employees.

If any Flemish students would be interested to do an internship on ESB Performance, take a look at the "I8C stages", and contact me if motivated.

Saturday, October 20, 2012

Hadoop for spying on you?

Just watched the Youtube video "Introducing Apache Hadoop: The Modern Data Operating System".  Interesting presentation given by Amr Awadallah of Cloudera at Stanford University.

During the Q&A round, Mr Awadallah referred to one of their customers - Skyboximaging - that is setting up a large scale Hadoop infrastructure.  Skyboximaging will be launching small, low cost satellites that can monitor all sorts of things happening on the ground, to provide up-to-date information - HD video and photographs - about how things look on the ground.

This up-to-date information can be used for all sorts of purposes.  But Mr Awadallah also referred to some nice use cases: how many cars are on the parking lot of your competitor, what is being loaded into trucks.  "Everything you can see from the sky is public".  And Hadoop will be used to process these large volumes of data, so helping the spying!

Slide from a presentation given at Hadoop world in 2011.