Why is the Facebook API so crappy?

(This isn’t going to be a rant. I’ll try to figure out why the problems with the Facebook Graph API exist.)

The other day an article got popular on DZone. It described the same problems that some colleagues of mine and I have been experiencing, and confirmed my suspicion – it’s not just me – the Facebook Graph API is bad. Guess which is the longest method in the welshare codebase (my startup project, where I’m using the Graph API extensively). It’s called postToMessage and obtains all the data I need to display a Facebook message. All the code analysis tools that I use report problems with it: “Cyclomatic Complexity is 37 (max allowed is 7); NCSS for this method is 78 (max allowed is 50);”. And this is the only method that violates all of these limits. I could split it to reduce the cyclomatic complexity a bit, but the code would still be bad. Why? Because the Facebook API forces the code that uses it to do “ugly” things in order to get everything it needs.

Here’s an (incomplete) list of what is currently wrong with the Facebook API:

  • It’s a mess. For example, there is the Post object, which should contain everything about a wall post. But it is really tedious to extract information from it. See the table I compiled in a previous post. In short – anything can be in any field.
  • It changes constantly. For a few days it just doesn’t fetch comments for some posts, for example. At one point you get the number of likes, at another you don’t. Not only do the contents of fields change, but so does the structure of the response. As a result, API wrappers need to change as well.
  • There is no API version. This is something I’m amazed about. How come an API does not have a version? If there were one, Facebook would be able to make changes without affecting clients. As it is, your app just stops working.
  • Some functionality is not available through the API, for example mentioning users or adding the “Share” button. This is basic stuff and should be there. Some functionality is provided only by the old API, so each project ends up using two APIs.
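
To give an idea of what “anything can be in any field” does to client code, here is a minimal sketch in Python. The field names (message, description, name, likes, comments) and the shapes they take are assumptions made for illustration, not the documented Graph API schema, and this is not the actual postToMessage from welshare. Every branch exists only because the same piece of information may arrive in a different place or form:

```python
from typing import Any, Dict


def post_to_message(post: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten a loosely structured post payload into a predictable shape.

    All field names below are hypothetical and used only for illustration.
    """
    # The text may live in different fields depending on the post type.
    text = post.get("message") or post.get("description") or post.get("name") or ""

    # Likes may be absent, a plain number, or an object carrying a count.
    likes_raw = post.get("likes")
    if isinstance(likes_raw, dict):
        likes = likes_raw.get("count")
    elif isinstance(likes_raw, int):
        likes = likes_raw
    else:
        likes = None  # unknown, which is not the same as zero

    # Comments may be missing, a bare list, or wrapped in a "data" list.
    comments_raw = post.get("comments")
    if isinstance(comments_raw, dict):
        comments = comments_raw.get("data") or []
    elif isinstance(comments_raw, list):
        comments = comments_raw
    else:
        comments = []

    return {"text": text, "likes": likes, "comments": comments}
```

Three fields already need eight branches. A real post has many more fields, which is roughly how a method ends up with a cyclomatic complexity of 37.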

And this is supposed to be the new and better API that is replacing the “old REST API”. I guess I should be glad I haven’t used it extensively.

But why is it so bad? Here is a list of possible reasons:

  • Engineers are not good enough. Yes, this sounds impossible – the best engineers go to Facebook (and Google), right? But there are many different areas of software engineering. If Facebook interviews are as computer-science intensive as those at Google, then they have really great talent for writing complex algorithms, data structures, distribution mechanisms and so on. But that doesn’t mean they have people who can write APIs. Joshua Bloch has a wonderful presentation about designing APIs. The point is that it is intrinsically hard to write APIs and there are many ways you can get it wrong, especially if you don’t have that niche experience.
  • The huge amounts of data force them to make changes. The priority is to handle the data efficiently. Other considerations, like the API, come after that. So if something doesn’t appear to fit in the API anymore because it consumes too many resources, it gets changed. But that should have been anticipated, and API versioning could have solved this issue (return dummy values for older versions, for example, instead of breaking everything). One clear example of this scenario comes from Twitter, where the number of retweets is not always returned because of performance issues. That said, perhaps it is a bit easier to work with the API in a dynamic language.
  • Company policy. Facebook’s main revenue is adverts, and adverts are viewed on facebook.com, not on other sites that consume Facebook content. So not giving a fully-functional API makes sense. Users should still have incentives to go to facebook.com.
  • Facebook code is rapidly changing and these changes get reflected in the API as well. Structures are changed, and some pieces are moved from one place to another. Sometimes these changes make some API calls stop working or work in a different way. In other words, the abstraction of the API is leaking – we have to know what situation has caused an undocumented API behaviour. Although I have no idea what the code looks like, the API should ideally be a separate layer that consumes the internal code. So in many cases, if the internal code is refactored, the API can stay the same with some modifications in the API layer, and it should not leak the underlying details (class structure, performance issues, etc).
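
The versioning and “separate layer” ideas from the list above can be sketched in a few lines of Python. Everything here is hypothetical: InternalPost, the "v1"/"v2" labels and the response shapes are invented for illustration and say nothing about how Facebook’s code is actually organised. The point is only that the internal model can change (here the like count becomes optional because it is expensive to compute) while older clients keep getting the shape they were built against, with a dummy value where needed:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class InternalPost:
    """Hypothetical internal model; free to change without notice."""
    post_id: str
    body: str
    like_count: Optional[int]  # None when counting is too expensive


def render_v1(post: InternalPost) -> Dict[str, Any]:
    # Old contract: "likes" is always a number, so fall back to a dummy 0.
    likes = post.like_count if post.like_count is not None else 0
    return {"id": post.post_id, "message": post.body, "likes": likes}


def render_v2(post: InternalPost) -> Dict[str, Any]:
    # New contract: "likes" is an object and is omitted when unknown.
    out: Dict[str, Any] = {"id": post.post_id, "message": post.body}
    if post.like_count is not None:
        out["likes"] = {"count": post.like_count}
    return out


# The API layer: one renderer per published version.
RENDERERS: Dict[str, Callable[[InternalPost], Dict[str, Any]]] = {
    "v1": render_v1,
    "v2": render_v2,
}


def serve(post: InternalPost, version: str) -> Dict[str, Any]:
    """Render a post in the response shape promised to the given version."""
    try:
        renderer = RENDERERS[version]
    except KeyError:
        raise ValueError(f"unsupported API version: {version}") from None
    return renderer(post)
```

With this split, a refactoring of InternalPost only requires touching the renderers, and a client pinned to "v1" never sees the structure of its responses change.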

Should Facebook release another, better API? I don’t think so. But they should try to be more careful with this one. How exactly? Their engineers should know better.
