19 May 2013

Semantic Web and Linked Data Storage

A Semantic Web application depends heavily on an efficient back-end storage and indexing strategy, since most of the processing stems from it. Leaving this most valuable part of the architecture until the end, as an afterthought behind the interface, is a bad move. One should always think through the data layer first. A Semantic Web application is like a workflow of services in a pipeline and has to be designed in that manner, as everything depends on resources and the active querying of those resources. In fact, extending the model with linked data VoID interlinks increases the data requirements substantially.

There are generally three ways of approaching a back-end for the Semantic Web. The first approach is to treat it in a pure W3C manner, like a regular client-server model, with the triplestore as the server and the web interface of services as the client. The second approach is to apply finer granularity using property graphs with the TinkerPop framework. In this manner a whole range of graph properties and NoSQL options emerge. The third approach is to take a standard relational model and convert it into an RDF repository. In all three cases an RDF interface layer, similar in role to JDBC, is required, and possibly a search indexing layer as well.
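As a rough illustration of the third approach, the sketch below turns one relational row into RDF triples and serializes them in an N-Triples-like form. The namespace, the table and column names, and the helper functions are all hypothetical, and every value is emitted as a plain string literal; a real mapping would normally be done with a dedicated tool and would handle datatypes and foreign keys.

```python
from typing import Dict, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace used only for this illustration.
BASE = "http://example.org/"


def literal(value: object) -> str:
    """Render a value as a plain string literal with basic escaping."""
    text = str(value)
    # Escape the backslash first so later escapes are not doubled.
    for raw, escaped in (("\\", "\\\\"), ('"', '\\"'), ("\n", "\\n"), ("\r", "\\r")):
        text = text.replace(raw, escaped)
    return f'"{text}"'


def row_to_triples(table: str, key: str, row: Dict[str, object]) -> Iterator[Triple]:
    """Map one row to triples: the primary key names the subject,
    each remaining non-NULL column becomes a predicate/object pair."""
    subject = f"<{BASE}{table}/{row[key]}>"
    for column, value in row.items():
        if column == key or value is None:
            continue  # NULL columns simply produce no triple
        predicate = f"<{BASE}{table}#{column}>"
        yield (subject, predicate, literal(value))


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    # Hypothetical row from a hypothetical "person" table.
    example = {"id": 7, "name": "Ada", "city": None}
    print(to_ntriples(row_to_triples("person", "id", example)))
```

Using the primary key to build the subject IRI keeps the mapping stable, so reloading the same row yields the same triples rather than duplicates under new identifiers.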

The two most common interface layers, both of which also come with their own storage layers, are Sesame and Jena. Sesame is the more versatile of the two, providing more robust features, and a majority of triplestores are based on its model. Jena appears to take a stricter W3C-driven approach. In both cases the bundled storage is not sufficient for production requirements, as the data can grow very quickly. One obviously has to leave room for current and future data needs. Often clustering will be required to scale out SPARQL queries. In almost all cases a read-only SPARQL endpoint has to be provided for users to interface with. SPARQL 1.1 also adds update operations such as INSERT and DELETE. However, these operations should be restricted to admin level.
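One way to picture the read-only restriction is a small gate placed in front of the endpoint. The sketch below is a heuristic, not a SPARQL parser: it strips IRIs, string literals, comments and PREFIX declarations, then looks at the first keyword that remains. The function names and the `is_admin` flag are assumptions made for illustration; in practice one would rely on the triplestore's own access control or expose separate query and update endpoints.

```python
import re

# Query forms and update operations defined by SPARQL 1.1.
READ_FORMS = {"SELECT", "ASK", "CONSTRUCT", "DESCRIBE"}
UPDATE_FORMS = {"INSERT", "DELETE", "LOAD", "CLEAR", "CREATE",
                "DROP", "COPY", "MOVE", "ADD", "WITH"}

_IRI = re.compile(r"<[^<>\s]*>")
_STRING = re.compile(r'"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'')
_COMMENT = re.compile(r"#[^\n]*")
_PREFIX = re.compile(r"PREFIX\s+[^\s:]*:", re.IGNORECASE)
_WORD = re.compile(r"[A-Za-z]+")


def classify(query: str) -> str:
    """Return 'read', 'update' or 'unknown' based on the first keyword.

    Heuristic only: IRIs are removed first so that '#' fragments inside
    them are not mistaken for comments, and PREFIX declarations are
    removed so that a prefix named e.g. 'add:' is not read as a keyword.
    """
    text = _IRI.sub(" ", query)
    text = _STRING.sub(" ", text)
    text = _COMMENT.sub(" ", text)
    text = _PREFIX.sub(" ", text)
    for word in _WORD.findall(text):
        upper = word.upper()
        if upper in READ_FORMS:
            return "read"
        if upper in UPDATE_FORMS:
            return "update"
    return "unknown"


def is_allowed(query: str, is_admin: bool) -> bool:
    """Reads are open to everyone, updates to admins only,
    and anything unrecognized is rejected."""
    kind = classify(query)
    if kind == "read":
        return True
    if kind == "update":
        return is_admin
    return False
```

Rejecting anything the gate cannot classify errs on the side of safety, which suits a public endpoint better than letting unknown requests through.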

Open source triplestores are generally quite limited for production use, so a workaround sometimes has to be applied to meet scalability and storage needs. Currently, the top performing triplestores include Virtuoso, OWLIM, and AllegroGraph, all very much commercial and each with quite a large toolset. The next best triplestore would be Bigdata, which is a fairly good open source option providing clustering, sharding, and full-text indexing. It also has a ZooKeeper connector. For a property graph one can almost always use Neo4j or OrientDB. OrientDB provides a more liberal license option. Solutions that use Hadoop as the underlying back-end storage layer tend not to perform very well, due to the nature of its distributed design. The storage layer could be deployed to a clustered, production-ready 64-bit environment with 4-8 CPU cores.

The Semantic Web is really starting to take off, and more and more interesting options are starting to emerge. However, it is still the case that open source solutions are lacking in production quality and are more experimental, suited to research use. The field is still dominated by commercial players who provide a Swiss army knife of solutions at an obvious premium. There is still a lot to be done, even in making the Semantic Web more accessible for developers, as the W3C specifications can be quite complex and in a lot of ways there is a bewildering set of models to apply in varied combinations. Perhaps the introduction of JSON-LD will help make linked data more accessible for front-end developers. Simplicity and convergence are key to making the Semantic Web the next evolution for Big Data and the Internet.

Java:
Sesame
Jena
Tinkerpop
linkeddataapi
any23
marmotta
stanbol
rdf2go
sesametools
groovysparql
pellet
owl-api
jsonld for java

Python:
Redland
RDFLib
Bulbflow
RDFAlchemy
Fuxi
Surf
ORDF
Django-rdf
Djubby
pysparql
sparta
Oort
sparqlwrapper


JavaScript/Node.js:
RDFQuery
Tabulator

Semantic NLP:
KEA
OpenNLP 
DBPedia Spotlight
Maui

Graph stores:
Neo4j
OrientDB
AllegroGraph
Virtuoso
Bigdata
Ontotext
Titan
Stardog

W3C:
SPARQL 1.1
RDF
JSON-LD

Reconciliation:
Google Refine

11 May 2013

Food Places To Avoid In London

In a big city like London there is an unprecedented number of food places on offer, and with them comes a high degree of risk: the food ranges from perfectly edible to hygienically unpredictable. Often such places are made attractive by bargain pricing. However, with that option also comes cheaper quality. It is left to the individual to decide whether quality outweighs price. In my opinion, one should never bargain on the quality of food the way one would for an item of clothing or electronics. A lot of these places are also made accessible to the tourist or the person on the move through quick self-service options and competitive pricing. A few of the common places that I have found to carry a certain degree of unpredictability, and a risk of falling sick the following morning, are listed below. I hope one does not rely on such fast food places on a daily basis, as that would quickly affect one's health over time. Cooking one's own meal at home is almost always the best option.

  • McDonald's

They advertise so much on TV that one can almost get brainwashed into paying them a visit. They also provide one of the most commercialized burger models around at an extremely competitive price. However, the quality of the food is very much on the unpredictable side. Often one will feel sick straight after a meal, almost like getting bad reflux. The food is also quite unhealthy if eaten on a regular basis. Best to avoid it and try some other, more authentic burger places. Sure, these other places may be more expensive, but they are well worth it.

  • Burger King

Another fast food place in competition with McDonald's. One gets a different style of reflux here, often more jerky and with a feeling of being bloated. Surely an unpredictable place as well. There are so many better alternatives for burgers around London. Does one really need to bargain on food and one's health?

  • Cottage Chicken

A cheaper version of KFC, where the fries are unsavory and the chicken almost feels like it's undercooked. KFC would be my advice, or better yet get a whole chicken from somewhere like Waitrose or Sainsbury's. Free range is even better than such factory-endorsed chicken fast foods.

  • Pret A Manger

One of the best places to obtain soggy sandwiches with some strange filling combinations which taste like they may almost be expired. If one just needs a quick sandwich while at work, a better option would be to make a homemade sandwich before heading off. That way one can control what one eats, the cost, and the quality, and avoid the hassle of standing in a queue. Another option might be to visit Paul instead.

  • Eat

Another competitor to Pret A Manger but with a little less quality, as if Pret A Manger's sandwich unpredictability was not enough. They also have a little less selection. Perhaps that means one less set of options for getting sick.

  • Certain Kebab Shops

Greasy kebab shops seem to be appearing on almost every high street these days. They can range from Turkish kebabs to Middle Eastern roll-ups. Although they turn out to be tasty in the moment, the following morning may just be one of the worst one has ever had after a night out. This is not to say all kebab shops are bad, but keeping an eye on how they prepare and serve the food is one way of testing the waters before placing an order. Edgware Road is one hot spot around London with a mix of unpredictable to really good kebab shops. But one can also visit Best Mengal and Sofra, which are even better in their own right.

  • Bargain Bucket Style Asian Curries and Buffets

These are often the kind of places where one can see and smell a lot of the food out in the open. In fact, that is what they set out to do as a way to sell cheaply and quickly. However, late at night they are basically trying to sell all of it off before it goes to waste. And even then one can't be too sure, as some of these places may just reheat the food and serve it again the next day. Chinese and Indian foods are often unpredictable for the western palate, especially in the level of spiciness but also in the way they are cooked. Being vigilant about what one eats is the best way to go. Something that looks and smells good won't necessarily taste good, let alone be healthy.

  • Tesco

They serve value meals which are not very good but count as bargains by Tesco's own standards, and it is always best to check the expiry date. My view would be to steer clear of such sandwiches, unless one doesn't really care about what goes inside one's body.

  • CostCutter Style Sandwiches

Another one of those places where you get a mixed set of bargains, with leftovers later at night. Baked items can be very greasy and the sandwiches are on the cheaper side, pretty much on a par with Tesco.

  • Tube Station Eat Outs

These places serve more expensive foods with lots to choose from. There is an extensive variety, but some of the unpredictable eat outs are situated here also. This is not to say that they are all bad. One safe option for food is M&S, which has a fairly good level of quality. Others that do work relatively well are places like Cornish Pasty. Also, would one really want to buy a sandwich from a place like Boots, where most of the stock is medicines or cosmetics?

4 May 2013

Subjectivity and Sentiments

Mining text for subjectivity is a hard problem that requires machine learning based approaches to extract information from large and varied data. Subjectivity is all about finding opinions, affects, and sentiments in texts. This could take the form of processing blogs, reviews, tweets, editorials, and general articles, as well as several other sources of textual content. Subjectivity is important in a wide range of domains. It can be valuable for identifying trends and forecasting in real-world applications such as financial markets, fashion, events, economic indicators, social markers, political voting, chart toppers, and many more where attitudes and feelings about a particular 'thing' or 'concept' matter. Finding such information automatically is hard, and using machine learning to identify and extract opinions and sentiments from texts is an active area of research. Developing such an application requires understanding and applying statistical natural language processing.
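To give a feel for what a statistical approach looks like, here is a minimal sketch of a bag-of-words Naive Bayes sentiment classifier with add-one smoothing. Naive Bayes is chosen here only because it is a common baseline, not because it is the recommended method; the class, the labels, and the four training sentences are made up for illustration, and a real system would be trained on a proper corpus.

```python
import math
import re
from collections import Counter, defaultdict
from typing import List, Optional


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over word counts, with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, text: str, label: str) -> None:
        self.doc_counts[label] += 1
        for token in tokenize(text):
            self.word_counts[label][token] += 1
            self.vocab.add(token)

    def predict(self, text: str) -> Optional[str]:
        """Return the label with the highest log-probability, or None if untrained."""
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Made-up training sentences, for illustration only.
    model = NaiveBayesSentiment()
    model.train("great food and friendly service", "pos")
    model.train("I love this place", "pos")
    model.train("terrible food and rude staff", "neg")
    model.train("I hate the soggy sandwiches", "neg")
    print(model.predict("friendly service, great place"))
```

Working in log space avoids numeric underflow on longer texts, and the add-one smoothing keeps a single unseen word from zeroing out a label's score.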

Sources of information on sentiment analysis are hard to find, as the subject is relatively new. The links below are a valuable entry point towards further information.

Subjectivity Analysis
sentiment-analysis
opinion mining sentiment analysis survey
Twitter-as-a-Corpus-for-Sentiment-Analysis-and-Opinion-Mining
sentiwordnet
twitter-sentiment-analysis-training-corpus-dataset-2012-09-22
sentiment140

2 May 2013

Holographic Messages For Mobile

I think it would be quite neat to have holographic message effects for mobile that could be sent just as one sends a text message, only with a lot of 3D realism. It would be quite similar to 3D TV, only without the need to wear glasses, and the effects would literally happen around the circumference of the mobile. Such message mates can be funny, comical, creative, and at the same time intimate, and can relay the mood between two individuals quite well without the need to type wordy texts. They could come in handy on all sorts of occasions, with work colleagues, friends, family, or even a partner. A few examples would do justice here. On Valentine's Day someone could send a virtual kiss or a virtual rose that rises up out of the mobile phone, maybe even with a message attached. One could even send a virtualized gift on someone's birthday with a pop-up clown or a stripogram. Another option could be sending particular moody messages: when one is angry at someone, one could send them a fire message that virtually sets their phone on fire and gives them a slight heated sensation in their hands. Or perhaps one wants to know when an email has arrived without the typical vibration or tone going off; one could set it so the phone feels like it is melting or freezing up. Virtualization on mobile could even allow for 3D conferencing, geomapping, flexible learning, remote security of car, home, and family, and so much more. Even the idea of allowing people to customize their own effects would be pretty cool. The effects could even include an aspect of ambient intelligence. It would be quite amazing to set one up for an Android phone or even an iPhone. It certainly doesn't seem impossible, especially looking at the introduction of Google Glass.