Tuesday, November 19, 2013

Journey With JackRabbit

                        

   Jackrabbit, with its versioning capabilities, is typically used as follows:
    1.    Create a repository, say "MyRepository".
    2.    Populate the repository with our data in a proper tree structure by creating nodes and child nodes.
    3.    To work with versioning, create a workspace and copy the nodes from the repository into it.
    4.    Work on the nodes in the workspace and save them.
    5.    Once the work is complete, merge the changed nodes back into the repository.
   
   
   Data in the repository and workspaces can be persisted in the following stores:
    1.    Local DerbyDB
    2.    MSSQL
    3.    Oracle DB
    4.    PostgreSQL
    5.    MySQL
   
   
   This seems pretty straightforward, but problems arise as the repository grows. Creating a workspace for a large repository is costly because the nodes are copied. Jackrabbit has APIs that let you create a blank workspace and then selectively copy only the node subtrees you need.
   This may be enough to cut down the size of a new workspace and the time needed to create it. But in some scenarios we might need to copy the whole workspace, because of dependencies and the way the application using Jackrabbit is designed.
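
   To make the cost difference concrete, here is a small sketch in plain Java. It does not use the Jackrabbit API at all: ToyNode and SelectiveCopyDemo are hypothetical names, and the tree (3 projects with 10 documents each) is made up for illustration. It only shows that a full copy writes every node, while a selective copy into a blank workspace writes just the chosen subtree. In real code the copying would go through the JCR API, for example Workspace.copy(srcWorkspace, srcAbsPath, destAbsPath).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a repository node; not a Jackrabbit or JCR class. */
final class ToyNode {
    final String name;
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(String name) {
        this.name = name;
    }

    /** Adds a child with the given name and returns it. */
    ToyNode add(String childName) {
        ToyNode child = new ToyNode(childName);
        children.add(child);
        return child;
    }

    /** Deep-copies this node and everything below it. */
    ToyNode copySubtree() {
        ToyNode copy = new ToyNode(name);
        for (ToyNode child : children) {
            copy.children.add(child.copySubtree());
        }
        return copy;
    }

    /** Counts this node plus all descendants, i.e. the nodes a copy must write. */
    int size() {
        int count = 1;
        for (ToyNode child : children) {
            count += child.size();
        }
        return count;
    }
}

final class SelectiveCopyDemo {
    /** Builds a hypothetical repository: root -> 3 projects -> 10 documents each. */
    static ToyNode buildRepository() {
        ToyNode root = new ToyNode("root");
        for (int p = 1; p <= 3; p++) {
            ToyNode project = root.add("project" + p);
            for (int d = 1; d <= 10; d++) {
                project.add("doc" + d);
            }
        }
        return root;
    }

    public static void main(String[] args) {
        ToyNode repository = buildRepository();

        // Full copy: every node in the repository is duplicated.
        ToyNode fullWorkspace = repository.copySubtree();

        // Selective copy: start from a blank workspace root and copy one subtree.
        ToyNode blankWorkspace = new ToyNode("root");
        blankWorkspace.children.add(repository.children.get(0).copySubtree());

        System.out.println("full copy nodes:      " + fullWorkspace.size());
        System.out.println("selective copy nodes: " + blankWorkspace.size());
    }
}
```

   In this made-up tree the full copy touches all 34 nodes and the selective copy only 12. The gap grows with the size of the repository, which is why selective copying helps whenever the application allows it.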
  
   In such scenarios, where selective copying is not an appropriate solution, we have to think about how to make the I/O faster.
   For such requirements we can use a NoSQL DB, which stores and distributes data horizontally, effectively increasing the overall I/O throughput.
   A couple of such persistence managers are being worked on and are available in a very early form.
   The underlying technology for persisting Jackrabbit data can be:
    1.    OrientDB, a NoSQL graph and document database
    2.    MongoDB, a NoSQL document database
   
    The persistence manager for OrientDB can be found at https://github.com/eiswind/jackrabbit-orient . It stores the information in a human-readable format.
   
    The persistence manager for MongoDB can be found at http://svn.apache.org/repos/asf/jackrabbit/sandbox/jackrabbit-mongo-persistence/src/main/java/org/apache/jackrabbit/core/persistence/mongo . It serializes the information and stores it in MongoDB.
   
    On comparing the performance of local Derby DB, PostgreSQL, OrientDB and MongoDB, the following times were observed
    for creating a workspace for a repository with 3000 nodes on a standard laptop (PM = persistence manager, DS = data store):
        LocalDerbyDB      : ~70000 ms   (PM and DS on Derby DB)
        PostgreSQLDB      : ~60000 ms   (PM and DS on PostgreSQL)
        OrientDB          : ~55000 ms   (PM on OrientDB and DS on PostgreSQL)
        MongoDB           : ~45000 ms   (PM and DS on MongoDB)
       
       
    So although MongoDB had the overhead of serializing/deserializing, it performed best among all the technologies compared. There is scope for performance improvement in OrientDB as well as MongoDB.
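
    To put the timings above in proportion, the following sketch works out how much time MongoDB saved relative to each of the other stores. The numbers are the approximate ones from the comparison above, so the percentages are only rough. The class and method names (WorkspaceTimings, percentSaved) are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WorkspaceTimings {
    /** Approximate workspace-creation times in milliseconds, taken from the comparison above. */
    static Map<String, Integer> timings() {
        Map<String, Integer> t = new LinkedHashMap<>();
        t.put("LocalDerbyDB", 70000);
        t.put("PostgreSQLDB", 60000);
        t.put("OrientDB", 55000);
        t.put("MongoDB", 45000);
        return t;
    }

    /** Percentage of time saved by the candidate relative to the baseline. */
    static double percentSaved(int baselineMs, int candidateMs) {
        return 100.0 * (baselineMs - candidateMs) / baselineMs;
    }

    public static void main(String[] args) {
        Map<String, Integer> t = timings();
        int mongo = t.get("MongoDB");
        for (Map.Entry<String, Integer> e : t.entrySet()) {
            // Compare MongoDB against every store (the MongoDB row itself gives 0).
            System.out.printf("MongoDB vs %-13s: %.1f%% less time%n",
                    e.getKey(), percentSaved(e.getValue(), mongo));
        }
    }
}
```

    With these figures MongoDB needed roughly 36% less time than local Derby, 25% less than PostgreSQL and 18% less than OrientDB.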
   
    Also, MongoDB supports sharding, so data can be distributed horizontally across nodes, and retrieval can be very fast using map-reduce.
    The only caveat about MongoDB is its AGPL license.
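
    As a rough picture of what distributing data horizontally means, here is a toy hash-based router in plain Java. It is not how MongoDB actually places or balances data, and it uses no MongoDB API. ToyShardRouter and the "node-N" ids are hypothetical. It only illustrates the idea that each stored node is assigned to one of several shards, so reads and writes are spread across machines.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of horizontal distribution; not MongoDB's real sharding logic. */
final class ToyShardRouter {
    private final int shardCount;

    ToyShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Maps a node id to a shard index in the range [0, shardCount). */
    int shardFor(String nodeId) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(nodeId.hashCode(), shardCount);
    }

    /** Counts how many of the given ids land on each shard. */
    int[] distribution(List<String> nodeIds) {
        int[] counts = new int[shardCount];
        for (String id : nodeIds) {
            counts[shardFor(id)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 3000 hypothetical node ids, matching the repository size used in the comparison.
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 3000; i++) {
            ids.add("node-" + i);
        }
        int[] counts = new ToyShardRouter(3).distribution(ids);
        for (int shard = 0; shard < counts.length; shard++) {
            System.out.println("shard " + shard + ": " + counts[shard] + " nodes");
        }
    }
}
```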
  

Friday, October 11, 2013

Unlimited Possibilities with RaspberryPI

Everyone wants to have smart devices around. Smart devices can now make decisions by reading information from the internet and from most of the digital devices around them, and can control those devices remotely over the "Internet". Many microcontrollers and dedicated hardware devices are used for this. But wouldn't it be nice to have a computer with Java installed instead of a microcontroller? Microcontroller programming is a "pain" (for me at least).
       Enter the Raspberry Pi (http://www.raspberrypi.org/): a very small computer running Linux from an SD card. It has the bare minimum of interfaces: a LAN connection, USB and HDMI output. The hardware is decent, with an ARM chip, 512 MB of RAM and an SD card slot. It is powered through a USB port. It supports Oracle Java 7 and Python. There is a very good tutorial site (http://learn.adafruit.com/category/raspberry-pi) with lots of examples, including programming with motion sensors, motors, LEDs and relays. The best part is that it is only the size of a credit card.
       Use cases include home automation (https://code.google.com/p/openhab/), teaching in schools in developing countries (thanks to the low cost: $25-$35), helping people with disabilities, building a Raspberry Pi cluster, robotics, unmanned vehicles and making GPS sensors for pets. Endless possibilities.
      It is cheap, portable and has pretty decent computing power (in fact more than the desktop computer I bought in 2000). It can be easily programmed, store data, transmit data over the internet and read data from the internet, which makes it a wonderful device capable of automating everything. Possibilities are endless!!