Following on Twitter Using RSS

When I am trying to decide whether to follow a given account on Twitter or not, I usually look at the following three criteria:

  1. whether I am interested in this account's tweets
  2. whether this account's tweets include at least some degree of real-time relevance
  3. whether this account can participate in a discussion

It turns out, however, that there are plenty of accounts missing criteria #2 and/or #3. Bots that tweet links on a specific topic, for example. Or a celebrity comedian such as @shitmydadsays - that account probably won't respond to any mentions, and its tweets are rarely real-time sensitive.

I found it's much more efficient to consume such tweets not via Twitter, but via RSS. Twitter used to include a feed link on every account's home page, but not anymore. Here is how you can follow a Twitter account via RSS.

Let’s say you want to follow @StephenAtHome.

Open this link in your browser:

https://api.twitter.com/1/users/show.xml?screen_name=StephenAtHome

(If you prefer JSON, use https://api.twitter.com/1/users/show.json?screen_name=StephenAtHome).

Note the user id value - for @StephenAtHome it's 16303106.

Then add the following feed to your reader:

http://twitter.com/statuses/user_timeline/16303106.rss
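
If you do this often, here is a small sketch that automates the lookup. It relies on the same v1 endpoints shown above and assumes the JSON payload carries the account's numeric id under the "id" key:

```python
import json
import sys
import urllib.request

def feed_url_for(screen_name):
    """Look up an account's numeric id and build its RSS feed URL."""
    lookup = ("https://api.twitter.com/1/users/show.json?screen_name="
              + screen_name)
    with urllib.request.urlopen(lookup) as resp:
        user = json.load(resp)
    # The v1 payload exposes the numeric account id under the "id" key.
    return "http://twitter.com/statuses/user_timeline/%d.rss" % user["id"]

if __name__ == "__main__":
    # Usage: python twitter_rss.py StephenAtHome
    print(feed_url_for(sys.argv[1]))
```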

This method helps me better manage my Twitter reading experience by ensuring that real-time sensitive content and conversations go to Tweetdeck, while the rest ends up in Google Reader.

Categories: internet |

Network: From Hardware Past To Software Future

At this year's GigaOm Structure conference, the single event that attracted my interest the most was the network virtualization panel (I didn't attend the conference, I was only following along over the Internet). It wasn't just because it involved OpenFlow. I think there is a bigger trend at play here - a lot of functionality that we are used to seeing in network gear is moving to the application level, from hardware to software. OpenFlow is just one manifestation of this bigger trend. Let me explain.

Networking was first about moving packets, in large quantities and with low latencies. This demand was met by specialized hardware, which I assume was able to perform the job better than a general-purpose machine ("better" in this context means faster, more reliable and cheaper). From their early days, network vendors have also extensively focused on what developers of modern distributed or hyper-distributed applications focus on today - failure detection and fault tolerance. When application servers were still growing vertically (bigger machines with redundant power supplies, for example), the network was already using distributed gossip-like protocols to exchange information.

Over time, however, more and more services found their home within the network layer - load balancing, virtual addresses, traffic encryption and so on. The idea was to let the application remain unaware of all the complexity it was sitting on top of.

While this approach worked for a while, it ran into a wall. Firstly, without direct control over the network from applications, such setups were extremely inflexible and high maintenance (dedicated network engineering staff, a change management process on top of application code rollouts, etc.). Secondly, features baked into hardware take longer to tweak (unless the vendor had sufficient foresight to plan for new requirements). Thirdly, hardware is harder to replace from a financial perspective (pay up front, plus maintenance).

The final hit was delivered relatively recently by infrastructure-as-code. Flexible IaaS models can't effectively support customers' hardware. While there are places where hardware is still very visible to customers (VPN connectivity from customers' datacenters to their IaaS resources), this is a temporary phenomenon - there are numerous IaaS-compatible software solutions already (please see my disclosure in the upper right).

Furthermore, a lot of non-packet-moving functionality can be delivered efficiently in software these days. Look at Heroku - their frontend routing mesh is a massively scalable load balancer that can be tweaked in real time. Good luck trying to accomplish the same in hardware.
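
To make the "tweaked in real time" point concrete, here is a toy sketch of my own (not Heroku's actual routing mesh) - a round-robin backend pool whose membership can change while it keeps serving requests, with no reboot and no configuration push:

```python
import itertools
import threading

class BackendPool:
    """Toy round-robin load balancer pool that can be updated at runtime."""

    def __init__(self, backends):
        self._lock = threading.Lock()
        self._backends = list(backends)
        self._counter = itertools.count()

    def add(self, backend):
        with self._lock:
            self._backends.append(backend)

    def remove(self, backend):
        with self._lock:
            self._backends.remove(backend)

    def pick(self):
        """Return the next backend in round-robin order."""
        with self._lock:
            return self._backends[next(self._counter) % len(self._backends)]

pool = BackendPool(["10.0.0.1:80", "10.0.0.2:80"])
print(pool.pick())            # 10.0.0.1:80
pool.add("10.0.0.3:80")       # "tweak" the pool without restarting anything
print(pool.pick(), pool.pick())
```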

We currently think of the Ciscos and Junipers of the world as hardware vendors. What they actually are is software companies - they just don't let their software run anywhere except on their own hardware. I bet we are going to see this transformation play out within the next 3-5 years. In the not so distant future, network gear will go back to focusing on the one thing it does exceptionally well - moving packets. All other functionality will turn into software products that run on application servers.

Categories: cloud-computing | infrastructure-development |

Two Weeks on Twitter Without Reading My Timeline

TL;DR The Twitter reading experience is extremely inflexible and does not scale, and the company discourages third-party developers from innovating in the general-client niche. Twitter must significantly improve the reading experience, or allow third-party developers more freedom.


In the first half of this month, I decided to perform an experiment. For at least two weeks I didn’t read my Twitter timeline. I only sent an occasional tweet or replied if necessary (the plan was to reply to mentions and to tweets surfaced by multiple searches that I read via RSS).

What could be the point of such a weird arrangement? Public tweets in general form the basis of three distinct activities - publishing, participating in a conversation and reading (Twitter as a whole also supports one-to-one private messaging via DMs).

Each activity delivers its own benefits at the cost of mental focus and time. Reading is unique among them, however, because in a system based on following other accounts (where each account is free to publish anything it wants), its signal-to-noise ratio is significantly lower than that of the other activities.

A lower signal-to-noise ratio leads to higher costs (mental focus and time spent). As such, the information I obtain by reading my Twitter timeline is relatively costly to me. The goal of the experiment was to see if I could replace the Twitter timeline with a less costly way of obtaining the same information.

It turns out I couldn't do it easily. Reading blogs as I always do and checking Techmeme and Hacker News kept me informed about the most important news, but the color added by many of the folks I follow on Twitter was missing.

This outcome was somewhat expected. But there was another thing I realized during the experiment. Twitter the company has stopped paying attention to the reading experience (lists were their last innovation there). Even more worryingly, it is my understanding that they actively discourage third-party developers from building general-purpose Twitter clients. This leaves their official stance - the "river of updates" - as the only way of consuming (reading) one's timeline.

Maybe the "river of updates" is the best approach for many people (even though I doubt it). Maybe even for most. But saying it's the best experience for absolutely everyone is a stretch. I want bookmarks (the plural is not a typo); I want the ability to sort my timeline by attributes other than time (for example, location or sender); I want an "always on top" attribute; I want filters that can be shared between users - in addition to obvious criteria such as sender, time and location, I want advanced things such as the current rate of my timeline (how many tweets per minute are appearing in my timeline right now) and the send rate of a sender (how many tweets per minute the sender posted on average over the last minute, the last 5 minutes and the last 15 minutes) - see the sketch below for what I mean by rate.
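
To illustrate the kind of rate metric I have in mind, here is a rough sketch of my own (nothing Twitter provides) that computes tweets per minute over sliding 1-, 5- and 15-minute windows:

```python
import time
from collections import deque

class TimelineRate:
    """Track tweets per minute over sliding 1, 5 and 15 minute windows."""

    WINDOWS = (60, 300, 900)  # window sizes in seconds

    def __init__(self):
        self._timestamps = deque()

    def record(self, ts=None):
        """Record the arrival of one tweet (defaults to 'now')."""
        self._timestamps.append(time.time() if ts is None else ts)

    def rates(self, now=None):
        now = time.time() if now is None else now
        # Drop anything older than the largest window.
        while self._timestamps and now - self._timestamps[0] > max(self.WINDOWS):
            self._timestamps.popleft()
        result = {}
        for window in self.WINDOWS:
            count = sum(1 for ts in self._timestamps if now - ts <= window)
            result[window // 60] = count / (window / 60)  # tweets per minute
        return result

rate = TimelineRate()
for _ in range(30):
    rate.record()
print(rate.rates())  # e.g. {1: 30.0, 5: 6.0, 15: 2.0}
```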

Granted, I don't mind if Twitter itself doesn't feel these features are worthy of its official client. But if that's the case, Twitter must not discourage third-party clients either. And if Twitter sticks to its guns on this, I hope it won't be too long before it's overtaken by someone else who provides a better reading experience.

Categories: internet |

JSON vs XML in API

George Reese recently wrote a blog post about API design, and William Vambenepe commented on it here. This is an interesting topic; I have a post on this subject too - it's titled Developing API Server - Practical Rules of Thumb. In this post I would like to expand on the first point George made in his post - JSON vs XML.

As you may know, I led the design and development of the VPN-Cubed API at CohesiveFT, so I am approaching this subject primarily from the perspective of the API server side, not the client.

We designed the VPN-Cubed API to support both JSON and XML. Since GET requests in HTTP have no body, arguments must be passed in the query string. For all other HTTP methods, we take the arguments as a hash, convert them to JSON (or XML), set the Content-Type header appropriately and send the resulting representation of the arguments as the body of the request. The client also selects the format in which it wants to receive the response by using the Accept header. We assume a default value of application/json - which means JSON responses are sent by default.
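
To illustrate the scheme (the endpoint and parameters below are made up for illustration, not the actual VPN-Cubed API), a client call could look roughly like this:

```python
import json
import urllib.request

def api_call(method, url, args=None):
    """Send `args` as a JSON body and ask for a JSON response back."""
    body = None
    headers = {"Accept": "application/json"}
    if args is not None and method != "GET":
        # Non-GET requests carry their arguments in the body as JSON.
        body = json.dumps(args).encode("utf-8")
        headers["Content-Type"] = "application/json"
    # For GET, arguments go into the query string instead (not shown here).
    req = urllib.request.Request(url, data=body, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Hypothetical usage - endpoint and parameters are invented for this sketch.
# result = api_call("POST", "https://api.example.com/instances",
#                   {"image": "base", "count": 2, "tags": ["web", "staging"]})
```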

Despite being designed for both JSON and XML, the VPN-Cubed API ended up shipping with JSON only. And there is a reason why we chose JSON over XML.

Generally speaking, an API is an exchange of messages - the client submits a request, the server returns its response (example: "need a new instance with these parameters" - "here is the current representation of the instance object you requested"). In most domains, the overwhelming majority of messages map nicely to nested lists (arrays) and hashes (dictionaries). This is a key insight that plays a role in the JSON vs XML battle.

There is no easy and universal way to represent nested hashes and arrays in XML (if there is, I hope to hear from you about it - I need stable libraries for all major programming languages that can convert arrays and hashes to XML and back, and that interoperate with each other). Of course it's possible and not terribly difficult, but it's something one must do oneself.

Contrast this with JSON - you don't need to worry about any of this; it's already taken care of for you. The only limitation of JSON we ran into is that it doesn't allow integers as hash keys - you have to convert them to strings or use an array instead of a hash.
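
A quick sketch of both points - the natural mapping of nested hashes and arrays, and the integer-key caveat (the message content is made up for illustration):

```python
import json

# A nested message (hashes and arrays) serializes to JSON with no extra work.
message = {"instance": {"name": "vpn-gw-1",
                        "addresses": ["10.0.1.5", "10.0.1.6"],
                        "tunnels": [{"peer": "10.9.0.2", "up": True}]}}
wire = json.dumps(message)
assert json.loads(wire) == message              # round-trips cleanly

# Integer hash keys, however, come back as strings.
print(json.dumps({1: "first", 2: "second"}))    # {"1": "first", "2": "second"}
assert json.loads(json.dumps({1: "a"})) == {"1": "a"}
```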

There are certainly features in XML that JSON doesn't have, but for us this mapping issue was the show stopper. While your mileage may vary, I think it is the biggest reason why JSON has been slowly gaining ground on XML in the API world recently. See also this post on Programmable Web.

Categories: software-engineering | cohesiveft |

IaaS vs PaaS

For a very long time, I regarded platform-as-a-service (PaaS) as a catch-all bucket for everything cloudy that was not software delivered over the Internet on demand (SaaS) or infrastructure (IaaS). Over the past several months, however, with the announcement of new players in the PaaS space such as CloudFoundry and OpenShift, I found myself thinking about PaaS in a new light.

PaaS currently seems to be converging on a concept that is essentially an expanded application server (typical examples of application servers are Tomcat, Weblogic, Websphere, Glassfish, JBoss, etc.). You package your web application in a certain way and upload it to the server. The server then sets up the environment (your database connection pools, for example) and runs your app.

PaaS of course adds a few twists (examples of functionality that PaaS could offer include multitenancy, autoscaling, an API, off-premises hosting and multi-language support), but fundamentally it feels to me like a glorified application server.
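
As a minimal illustration of the "server sets up the environment" idea (my own sketch, not tied to any particular PaaS; the DATABASE_URL variable is a hypothetical example of something the platform might inject), the application just reads its configuration from the environment instead of setting it up itself:

```python
import os
from wsgiref.simple_server import make_server

# The platform is assumed to inject connection details such as DATABASE_URL;
# the app itself never configures pools, hosts or credentials directly.
DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///dev.db")

def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("connected to " + DATABASE_URL).encode("utf-8")]

if __name__ == "__main__":
    # Locally you run the server yourself; on a PaaS the container does this.
    make_server("", 8000, app).serve_forever()
```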

Several observations.

Firstly, every big software vendor seems to have at least one product in its current lineup that in some shape or form fits into the application server space. I expect each of these vendors to repackage their offerings into a PaaS or PaaS-like product - the more the merrier.

Secondly, the more I think about it, the more convinced I become that a private PaaS will dominate private IaaS at enterprises for applications developed in-house. If a company adopts one of these application servers as an internal standard today, it simply makes no sense to allow internal development of applications that would not run on it.

Thirdly, you gotta hand it to Google - when everyone was crazy about the cloud model popularized by Amazon EC2, they didn't cave in and start offering low-level OS VMs. They have focused on language VMs (the Python VM, the JVM) and up since the very beginning - which looks exactly like what PaaS has become now. In their latest release, they added backends for long-running background processes (in other words, daemons that don't fit the HTTP request-response model). I expect other PaaS implementations to follow suit.

Fourthly (as a direct consequence of points #2 and #3 above), I now think that private IaaS clouds will become the place where enterprises run their vendor-supplied (possibly closed-source) non-web-based workloads. As a result, software vendors will need to adopt new ways of distributing their software. There will be no need to build installers and try to detect a machine's hardware and OS. All software can be shipped as a VM image (with full, partial or no customer access).

And finally, I am now convinced that today's PaaS moniker should really be application server as a service. Or - to make the acronym easier to pronounce - webapp container as a service (WCaaS or ACaaS). There is simply too much "platform" beyond the application server use case - think data store as a service, messaging bus as a service, external connectivity as a service, load balancing as a service, naming as a service, and so on. Each of these could be a standalone service.

Good times for cloud computing!

Categories: cloud-computing |
