Switching from LAMP to MEAN on my to-do app, Egotask

February 12, 2016 by Keith

Categories: AngularJS, Node.js

Last month, I decided to change the technology stack of a project I started last year, something I've been meaning to do for a while. I've spent some time working with Node.js and Express as backend frameworks, and I'm very satisfied with the results. On top of the performance improvements, my main reason to make the switch was actually ease of use; Node with Express is a very intuitive framework, which matters for my productivity as a developer. Full support for Socket.io was also a game-changer, since I needed to integrate web sockets for real-time communication.

The project I'm developing is a "to-do list" application called Egotask, which is currently built with AngularJS 1.0 on the client side. When users make changes to lists or tasks, the updates need to be saved on the backend in real time, a common pattern in modern user applications. After writing a solid backend foundation with PHP and Apache, I wondered if there might be scaling issues using a RESTful API to accomplish this.

HTTP Request Overhead

The API I built with PHP was my first attempt to create an endpoint-based CRUD system. For example, sending a POST request to '/api/tasks' creates a new task in the database and returns it to the client.
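For instance, creating a task from the Angular side looks roughly like this; the payload and response shape are simplified for illustration and aren't the exact API:

// hypothetical example: create a task through the REST API
$http({
  method: 'POST',
  url: '/api/tasks',
  data: { name: 'Write blog post' },
}).then(function(response) {
  // the server responds with the newly created task, including its id
  $scope.tasks.push(response.data);
});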

My concern was whether it made sense to use these HTTP requests for each task update, regardless of the size of the data. The following Angular controller uses PUT requests to the API to update tasks.

angular.module('Egotask')
.controller('TasksCtrl', function($scope, $http) {
  // load the tasks (normally you'd do this in an injected service)
  $http({
    method: 'GET',
    url: '/api/tasks',
  }).then(function(response) {
    if (response.data && response.data.tasks) {
      $scope.tasks = response.data.tasks;
    }
  }, function(response) {
    $scope.error = 'Something went wrong :(';
  });

  // submit changes; task must have an id
  $scope.edit = function(task) {
    $http({
      method: 'PUT',
      url: '/api/tasks/' + task.id,
      data: task,
    });
  };
});

For the sake of this example, assume we have an input element in our view responsible for editing the task name.

<div ng-controller="TasksCtrl">
  <input type="text" placeholder="Task"
    ng-model="currentTask.name"
    ng-model-options="{'debounce': 400}"
    ng-change="edit(currentTask)"/>
  <ul>
    <li ng-repeat="task in tasks" ng-click="currentTask=task">
      {{task.name}}
    </li>
  </ul>
</div>

The user can now click on a task and edit its name using the input box, triggering HTTP requests which send the update to the server.

Notice the ng-model-options="{'debounce': 400}", which delays the model from updating by 400 ms. This prevents $scope.edit() from being triggered too often as the user types while remaining reasonably close to live updating. But consider the overhead these AJAX calls are creating.

Each HTTP request sends several headers containing metadata and usually some sort of authentication to the server. For a LAMP server, that's generally a session token stored in a cookie that PHP validates. We would store the user's ID in the PHP $_SESSION variable to ensure the client is authorized to update a specific task. This provides some security, but it isn't efficient. Do we really need to re-authenticate the user on every request? Do we need to spend bandwidth on the same header metadata each time? A better solution would be to hold a single, long-lasting connection after authenticating the user just once. That's where web sockets come in, and it's what influenced my decision to integrate Socket.io and replace most of the RESTful API requests.
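To put that overhead in perspective, a single debounced task update looks roughly like this on the wire (the host, cookie, and task values are made up for illustration):

PUT /api/tasks/42 HTTP/1.1
Host: egotask.example.com
Connection: keep-alive
Content-Length: 34
Accept: application/json, text/plain, */*
Content-Type: application/json;charset=UTF-8
Cookie: PHPSESSID=4f9c1c0b2a6e93d8
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.8

{"id":42,"name":"Write blog post"}

Several hundred bytes of headers, plus a session lookup on the PHP side, accompany a 34-byte payload, and that happens on every debounced burst of typing.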

Socket.io

Socket.io is a simple yet powerful web socket library, helping to eliminate the unnecessary overhead of small updates by creating a lasting connection between the client and server. In my implementation, the initial connection is authenticated using a JSON web token to extract the user ID of the client. After establishing the connection, the client can push data in small chunks over the socket without re-authentication.

Let's modify the Angular controller above to use socket.io.

angular.module('Egotask')
.run(function(socketService) {
  socketService.connect();
})
// create a socket.io service
.factory('socketService', function() {
  var model = { io: {} };

  // 'token' would carry the user's JSON web token for the handshake (not shown here)
  model.connect = function(token) {
    model.io = io.connect('http://localhost', {
      'transports': ['websocket'],
    });
    model.io
      .on('connect', function() {
        console.log('Connected using socket.io');
      })
      .on('disconnect', function() {
        console.log('Disconnected from web socket');
      });
  };
  return model;
})
.controller('TasksCtrl', function($scope, $http, socketService) {
  // GET request for tasks stays the same ...

  $scope.edit = function(task) {
    if (socketService.io.connected) {
      return socketService.io.emit('editTask', task, function() {
        console.log('Task updated');
      });
    }

    // fall back to the REST API when the socket isn't connected
    $http({
      method: 'PUT',
      url: '/api/tasks/' + task.id,
      data: task,
    });
  };
});

In the run block, we establish a web socket connection that's kept in the socketService factory for later use. $scope.edit() now uses socket.io to emit the task object, which the server receives as an 'editTask' event. I chose to keep the original RESTful call as a fallback for browsers without web socket support.

PHP doesn't support the web socket protocol out of the box – you have to pull in a library such as Ratchet, and possibly run it on a separate server – yuck. Node, on the other hand, handles web sockets naturally, and adding Socket.io to the server is painless.

const http = require('http');
const express = require('express');
const bodyParser = require('body-parser');
const port = 3000;
  
const app = express();
app.set('port', port);
app.use(bodyParser.json());
app.use(bodyParser.urlencoded({ extended: false }));
  
const server = http.createServer(app);
server.listen(port);

const io = require('socket.io')(server, {
  transports: ['websocket'],
});
io.use(function(socket, next) {
  // have some sort of user authentication here (omitted)
  next();
});

io.on('connection', function(socket) {
  socket.on('editTask', function(data, callback) {
    // 'data' will be the task update sent
    // perform database update here (omitted)
    callback();
  });
});
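As a sketch of the omitted authentication step, the middleware below verifies a JSON web token sent in the handshake query; the jsonwebtoken package, the 'token' query parameter, and the secret are assumptions for illustration, not Egotask's actual setup.

const jwt = require('jsonwebtoken');

io.use(function(socket, next) {
  // hypothetical: the client appends its JWT to the connection query
  const token = socket.handshake.query.token;
  if (!token) {
    return next(new Error('Authentication error'));
  }
  jwt.verify(token, 'my-secret', function(err, decoded) {
    if (err) {
      return next(new Error('Authentication error'));
    }
    // keep the user id on the socket for later event handlers
    socket.userId = decoded.userId;
    next();
  });
});

On the client, the token could be supplied with the connect options, e.g. io.connect('http://localhost', { query: 'token=' + token, transports: ['websocket'] }).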

Thread Blocking

In addition to request overhead, there's a problem with Apache servers that degrades performance as the number of incoming requests grows. Apache's I/O is blocking and synchronous, meaning it can start to crumble under a large number of simultaneous HTTP requests.

Imagine if we had 100,000 active users adding and editing tasks; there would be a massive increase in latency per request! You could technically upgrade the server hardware to allow for better multi-threading, but the paradigm overall doesn't promote scalability, at least for bursts of small requests. Each connection made to an Apache server is handled by a worker thread (or process, depending on the configuration). If a worker is left waiting on I/O, such as a particularly slow database query, that whole thread is blocked. Apache did, however, later release an event-driven, non-blocking configuration (the event MPM), similar in spirit to the event-driven I/O model popularized by Java NIO.
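For context, a typical prefork MPM configuration caps the pool of worker processes; once that many connections are tied up waiting on I/O, new requests simply queue. The values below are illustrative, not a recommendation:

<IfModule mpm_prefork_module>
    StartServers            5
    MinSpareServers         5
    MaxSpareServers        10
    MaxRequestWorkers     150
    MaxConnectionsPerChild  0
</IfModule>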

Node.js to the Rescue

Node.js is server-side JavaScript that is non-blocking and asynchronous from the ground up, using an event loop and callbacks as its concurrency model. This is exactly the sort of setup needed to handle thousands of concurrent connections without latency issues.
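As a rough illustration of that model (the route and db.updateTask helper are placeholders, not Egotask's actual code), a slow database call is handed off through a callback, so the single Node process keeps accepting other requests while the query runs:

// hypothetical Express route: the slow query doesn't block the event loop
app.put('/api/tasks/:id', function(req, res) {
  // db.updateTask stands in for an asynchronous database call
  db.updateTask(req.params.id, req.body, function(err, task) {
    if (err) {
      return res.status(500).json({ error: 'Update failed' });
    }
    // by the time this callback runs, Node may have served many other requests
    res.json(task);
  });
});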

Why didn't I rewrite the LAMP stack configuration to be non-blocking instead? At the time of writing, the V8 JavaScript engine used by Node.js has been shown to outperform PHP in several benchmark tests. Even WordPress, a platform which famously runs on PHP, recently began building its new admin dashboard with Node.

Using Node also removes many of the conversion complexities as data flows between the client, server, and database. With the entire stack written in JavaScript, I can focus more on the business logic and worry less about compatibility between two languages. I also feel less error-prone using a uniform language; that perhaps matters less for developers who work solely on the frontend or the backend, but as a full-stack developer I spend equal time on both.
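To make that concrete, the object the client emits is the same plain JavaScript structure the server handles, with no serialization layer to hand-write on either side (the id and name below are made up):

// client (Angular): emit a plain JavaScript object over the socket
socketService.io.emit('editTask', { id: 42, name: 'Write blog post' });

// server (Node): the same structure arrives, ready to validate and persist
socket.on('editTask', function(task) {
  console.log(task.name); // 'Write blog post'
});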

Next Steps

I haven't spent time discussing Angular's advantages and disadvantages as a client-side framework, but that's definitely worth exploring. In fact, Angular 2 aims to address many of Angular 1's issues around performance, simplicity, and modularity, among other things.

Regarding Egotask, there is still much to be improved on the frontend; once I finish migrating the entire backend to Node/Express/Socket.io, I plan on shifting my focus to boosting the performance of the client-side web application.