26 March 2017

IT Skills Shortage

Many news sources state that there is a skills shortage in various sectors of IT. However, on close inspection this really is not the case at all. In fact, there is a huge talent pool of candidates who are eager and willing to work. In most cases, every job attracts hundreds of applicants. Organizations are just looking for cultural fit. In other cases, a candidate might have the skills but simply not be someone the employer gets along with. That does not mean that there is a skills shortage. It just means the organization is discriminating on criteria that are not even relevant to the role or the skills. Outsourcing is also a key factor: talent is bought in on the cheap, leaving local talent behind. Many human resource evaluations state that the larger part of the interview is about the person rather than the quantifiable skills that they can bring to the team. It is more about likability, which is again a form of discrimination, since people want to work with people who are more like them. This definitely does not imply that there is a skills shortage in the market. There are plenty of women who are very good at Big Data and Machine Learning, but does one see them on the team? Employers are basically waving a wand of cultural nonsense and overlooking what really matters, which is the ability of the candidate to perform based on their skills as well as to innovate. Large organizations like Twitter, Facebook, IBM, Google, Microsoft, Amazon, and others harp on about a skills shortage, but they are not focused on diversity and inclusion within their teams. They also buy in cheap outsourcing contracts from places like India, ignoring the thriving local talent pool within their respective regions.
We need to recruit more women, and more generally people of all types and backgrounds, in IT, supporting the local communities where multinationals are based. That starts with each individual who is well positioned to evaluate applications on merit and objectivity. If people hire people, and people are not always objective, then perhaps artificial intelligence needs to take over some of the responsibilities of the human resources function in organizations.

25 March 2017

Future of Software Engineering

In the world of Big Data we essentially have four separate encompassing roles: Big Data Engineer, Data Scientist, Data Architect, and Data Analyst. In most companies these roles overlap considerably. However, most of these role types are variations of one main role, that of a Computer Scientist or, more simply put, a Software Engineer. A Big Data Engineer ordinarily is a Software Engineer who is able to look at the big picture. This essentially implies that a Data Scientist, a Data Analyst, and a Data Architect are really subsets of the Big Data Engineer. In future, we will witness the convergence of these roles into one, where many of the separate role responsibilities will merge into the Software Engineer role. As memory requirements grow to meet Big Data, Software Engineers will take on a broader scope of work. The convergence of these disparate roles implies that complexity becomes more manageable and accessible across a standard engineering team. Hiring a Data Scientist with a PhD is no longer going to be the safe, conservative choice for organizations. Many Software Engineers are able to do the work of Data Scientists as well as Big Data Engineers and manage to look at the big picture from the business standpoint. In almost all cases, algorithms can be taught, approaches can be taught, skills can be re-learnt, but what teams really require is the flexible mindset to adapt to change. Data Scientists have several limitations for organizations: they tend to have questionable programming skills in R and/or Python, are familiar with only a specific set of statistical and/or machine learning approaches, and generally apply imperfect or overfitted models to a very small subset of data under constraints that do not always adapt to the changing, realistic Big Data requirements of a business.
They also have a hacking mindset and a very strong dependency on Big Data Engineers to provide the backbone for ingestion of data sources, refactoring, feature engineering, data pipelining, and model scaling, and to ease their model building process. That is almost three quarters of the Big Data work. Ordinarily, Software Engineers can multitask across the entire stack as well as practically apply data science concepts. In future, the roles of Data Analyst, Data Scientist, Data Architect, and Big Data Engineer will no longer be necessary as part of recruitment, and they will eventually be eclipsed by the standard Software Engineer. It is not so difficult to attend conferences, read journal papers, conduct research, and build models on data. One only really needs the right business mindset or domain knowledge. Learning and applying new approaches is a continuous process through which Software Engineers keep their skills moving with the times. Organizations will demand more of hybrid Software Engineers, who must adapt to changes that are often guided by the business need and the data landscape. Applied Artificial Intelligence will mean that even building generalizable models for data becomes a simpler engineering process that does not require specialist skills.

Speech Recognition Toolkits

Kaldi
Moses
SRILM
Sphinx
HTK
Simon
Julius
TensorFlow (WaveNet)

Baconjs

Bacon.js

Crunch & Scrunch

Apache Crunch

Cybersecurity with Spot

Apache Spot

Swarm Bandit Robotics

Border control is an issue for most countries, whether landlocked or with direct access to waterways. Controlling every aspect of a border is not humanly possible in most cases. But AI can in fact help cover a larger distance with a greater amount of force and control. Artificial Intelligence in the form of swarm intelligence and reinforcement learning can create an effective force for border security and control. Building a great wall is pointless and overly expensive. Eventually the wall comes down. But an army of drones and robots, if well engineered, becomes a force to be reckoned with, and one that adapts to patterns of attack and infiltration. The ultimate goal is an autonomous army of swarm robots that can apply tactical understanding and manoeuvrability across the entire map of the country, advanced in strategic alliance and, when necessary, combat, to protect the sovereignty of a nation and its people.

Military Swarming
Kilobots
pentagon drone swarm autonomous war machines

24 March 2017

Cultural Fit

Cultural Fit is a shield many organizations use to cover up a set of attributes they look for that are intrinsic to their company. But if one inspects the concept closely, it becomes transparent that the notion is really a form of discrimination, and technically such biases are illegal in the workplace, not to mention in secular societies. Could this be the reason why diversity is still an issue in some organizations? And one reason why there are so few women in technology roles today? There are two attributes that drive a business: performance and innovation. And it really is these two attributes that should drive the recruitment process. It seems many companies are still hung up on the idea of cultural fit and end up recruiting individuals who are too similar to each other and who possibly treat their roles as a daily grind, where innovation gets inhibited and any enjoyment of work is lost. The glue that holds a team together eventually gives way to frustration, and an exceptionally high turnover of employees emerges within the organization, all with the view of maintaining some abstract concept of a culture. No two people are ever alike. Such concepts of cultural fit are developed and implemented by human resources, turning the workforce into an almost factory concept of workers. But if one inspects the function of human resources closely, do they really understand how to recruit the right people if all they do is hunt for keywords on a CV and check whether the candidate acts a certain way? In most cases, the entire human resources function at organizations can be replaced by artificial intelligence and data science. Many companies have already given up on the nonsensical idea of cultural fit, favoring diversity and inclusion instead.
Perhaps it is time organizations moved on from nonsensical approaches to recruitment, developed better, more objective means of evaluation, and focused more on what really matters for the role. Performance is a major driver for business. What does it matter if someone fits in or not? They could just as easily be lazy at work, bored out of their wits with the repetitive grind, and still fit into the bubble of a large organization without bringing any quantifiable output to the business, other than being another salaried or wage-earning employee who could become a victim of redundancy down the road. We do need to inspire, encourage, and be more accepting of people who are not like us, especially for a balanced workplace and for maintaining a healthy, inspiring energy for innovation. All this is important for the survival of a business and for continuously maintaining a competitive advantage. Also, is it any wonder that history repeatedly shows that the people who don't fit in are the ones who eventually become the real success stories in society? Why are we so hung up on maintaining the status quo in every form of society when even time eventually brings about change?

23 March 2017

Entrepreneur First

The majority of investors associated with Entrepreneur First expect you to have a PhD or a collaboration with someone who does. The funny thing is that most of these investors previously worked for companies that were made successful by founders who were college dropouts. Have many companies been made successful by a PhD holder? It is very rare. Most innovators, past and present, have been people with no PhD, not even a masters, and in many cases were simply college dropouts. But they still managed to build a successful business, engineer an innovative product, and gain investment. Looking back at the likes of Google, Microsoft, Apple, Oracle, Dell, PayPal, Facebook, and others, would any of the investors of this generation have asked the founders whether they had a PhD before investing? More often than not, an individual with a PhD will not have the mindset for taking risks and building a product that can actually sell. And if one needs to build an entire team around them, then the value of a PhD becomes fairly redundant. Why do investors have such double standards? And why is it that people with ideas can sometimes struggle to find investment? Perhaps it all boils down to risk, the gamble investors have to make on a potential idea, and their return on investment. Funnily enough, 80% of all academic research amounts to nothing and holds no real qualitative or quantitative value in terms of substance and real insights. Businesses don't make any money from publishing papers; they make money from selling a viable product. In fact, in most industry sectors, innovation in organizations is often driven by non-PhD people with many years of practical experience who have been able to spot gaps in the market or problem areas for which there is high customer demand for solutions.
The narrow-minded attitude of investors does not pay dividends, and it certainly does not help individuals with a viable product who may be seeking investment. However, such is the conservative attitude of many investors today that either a) one has a PhD, b) one has people with PhDs on the team, no matter how practically useless/clueless/inexperienced/unprofessional/unethical they may be, or c) one has a collaboration with a large multinational. Otherwise, one might struggle to convince an investor to buy into the innovative idea. Sometimes all it takes is an investor who is looking to invest in the individual (likability factor) and not just the product, as a sum package; for that, there are better alternative avenues for investment.

Entrepreneur First

Hackathons

Hackevents
UK Hackathons and Jams
hackathon
hackathon guide
techsoc

11 March 2017

Select Papers

F1
Megastore
Spanner
Recursive Deep Models
Drill
Random Forests
PageRank
Useful Things To Know About Machine Learning

Insurance Ontologies

Insurance ontologies are useful for building a semantic decision tree in the various contexts of policy and claim, as well as for providing the right insurance for a potential member. They also provide formalisms for a knowledge representation on the basis of which policy and claim decisions can be made. There is an ever-growing number of insurance types, and an even greater variety of providers, each with their individual policy features. Semantic discovery of the right insurance product with the right policy features for a policy holder is also important for a full alignment of coverage and the right balance of risk premium. Another place where such ontologies can apply is as a thesaurus service of terms, often in the form of a SKOS schema, for semantic extraction and understanding of an insurance document.

Example List of Some Attributes for Insurance:
  • Insurable Object Id
  • Insurable Object Type
  • Insurable Object Name
  • Insurable Object Detail 
  • Insurer Id
  • Insurer Name
  • Insurer Product Id
  • Insurer Product Name
  • Insurer Product Type
  • Insurer Product Detail
  • Member Id
  • Member Name
  • Member Type
  • Member Detail
  • Organization Id
  • Organization Name
  • Organization Type
  • Organization Detail
  • Agreement Id
  • Agreement Name
  • Agreement Type
  • Agreement Detail
  • Claim Type
  • Claim Id
  • Claim Amount
  • Claim Folder
  • Claim Document
  • Claim Offer
  • Claim History
  • Claim Status
  • Claim Link
  • Claim Decision
  • Party Role
  • Assessment Id
  • Assessment Result
  • Assessment Score
  • Fraud Assessment
  • Policy Type
  • Policy Id
  • Policy Name
  • Policy Coverage Detail
  • Policy Premium
  • Policy Condition
  • Policy Term
  • Policy Member
  • Policy Duration
  • Policy Pre-exist Condition
  • Policy Sub-Members
  • Policy Risk Score
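
As a rough illustration of how a few of the attributes above might be held in code, here is a minimal Python sketch. The class names, field names, and sample values are hypothetical and cover only a small subset of the list; a real ontology would be expressed in a formal language such as OWL or SKOS rather than plain classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    # Hypothetical subset of the "Claim ..." attributes listed above.
    claim_id: str
    claim_type: str
    claim_amount: float
    claim_status: str = "open"


@dataclass
class Policy:
    # Hypothetical subset of the "Policy ..." attributes listed above.
    policy_id: str
    policy_type: str
    policy_name: str
    policy_premium: float
    policy_member: str
    claims: List[Claim] = field(default_factory=list)

    def total_claimed(self) -> float:
        """Sum the amounts of all claims attached to this policy."""
        return sum(claim.claim_amount for claim in self.claims)


# Illustrative data only; the identifiers and amounts are made up.
policy = Policy("P-1", "health", "Basic Health", 120.0, "M-1")
policy.claims.append(Claim("C-1", "dental", 80.0))
policy.claims.append(Claim("C-2", "optical", 45.5))
print(policy.total_claimed())
```

Linking claims to a policy in this way is only one possible design; the list itself does not prescribe how the attributes relate to each other.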

Digital News

News is a cornerstone of society that keeps one informed of events of local and global scope across a plenitude of topics. Paper newspapers are on their way out, as many are now available digitally online. However, plenty of news services are still very much untapped, and news outlets still depend on conventional approaches. News should also be accessible to all, although this has been made possible to a degree by social media. The list below provides some ideas for extending into a new generation of news services that provide for objectivity and unlock untapped potential in news gathering, semantics, analytics, and reporting.

  • real-time bidding of news
  • submission of news by journalists (but one doesn't have to be a journalist to provide news of significant value, i.e. crowdsourcing of news for all)
  • return on investment for news sites and writers
  • analytics on news
  • SEO of textual and image content through natural language processing and deep learning
  • scoring news for influence, trust, and authority
  • digital news gathering and discovery
  • circulation graph
  • alerting and viral monitoring
  • sentiment analysis
  • defining knowledge actors/agents for: anchor, journalist, reporter, writer, editor, correspondent
  • event/story timelines
  • automated/assistive headlines and summaries
  • cloud-based services and API
  • news content publishing
  • collaborative and content-based filtering
  • increasing readership and news consumption
  • news curation
  • story semantics and connected news sources
  • identifying fake news
  • news accessibility
  • firehose of automated generation of news feeds
  • queryability for connected news in a linked data graph of stories
  • prevent risk events through proactive text-driven forecasting and reporting

9 March 2017

Google Addons List

Mail Merge
Scheduler
Email Extractor
Phone Number Extractor
Name/Address Extractor
Advanced Spam Filter
Digital Signature Maker
Digital Signature Protector
Signature Recognizer
Fraud Recognizer
Drive Auditor
Attachment/Link Scanner
Malware Scanner
Bulk Forward
Bulk Send
Contacts Update
Contacts Map
Purging
Advanced Filters
Self-destructing messages
Read Tracker
Auto-Expire Attachments/Links
Unsubscriber
Auto-Responder
Draft Templates
Recall Message
Regex Evaluator
Task Manager
Docs Filters
Personal Assistant
Diary Manager
Micro eCards
Message Resource Lookup (SPARQL)
Send Guard (don't send to unintended recipients by mistake)
Smart Rules
Mail Protector
Inbox Auditor

Gmailify
Gmail Learning Center
Google Apps Script

2 March 2017

Optimize Performance of Advertising Campaign

Kaggle Display Advertising Challenge

Scalable Machine Learning

Reasons to scale machine learning:
  • training data doesn't fit on a single machine
  • time to train the model is too long
  • volume of incoming data is too high
  • low latency requirements for predictions
How to spend less time on a scalable infrastructure:
  • choose the right ML algorithm, one that is fast and lean and able to work accurately on a single machine
  • subsampling data
  • vertical scalability
  • sacrificing accuracy if it is cheaper
Horizontal scalability options:
  • Hadoop ecosystem with Mahout
  • Spark ecosystem with MLlib
  • Turi (formerly GraphLab)
  • Streaming Technologies like Kafka, Storm, AWS Kinesis, Flink, Spark Streaming
Scalability considerations for a model-building pipeline:
  • choose a scalable option like logistic regression or a linear SVM
  • scaling up nonlinear algorithms by making approximations
  • use a distributed infrastructure to scale out
How to scale predictions in both volume and velocity:
  • infrastructure that allows scaling up across the number of workers
  • sending the same prediction to multiple workers and returning the first one back to optimize prediction velocity
  • choose an algorithm that can parallelize across multiple machines
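
Subsampling, listed above as one way to spend less time on infrastructure, can be sketched with reservoir sampling. This is one technique among several and is an assumption here, not something the notes name; the function name and parameters are hypothetical. It draws a uniform random sample of k rows from a stream that may be too large to hold in memory.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Return up to k items drawn uniformly at random from the stream.

    Only k items are kept in memory at any time (Algorithm R).
    """
    rng = random.Random(seed)
    sample: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Pick a slot in [0, i]; replace only if it falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample


# Illustrative usage: keep 10 rows out of a stream of 1000.
subset = reservoir_sample(range(1000), 10)
print(len(subset))
```

A fixed seed is used so that the same subset is drawn on each run; whether a uniform subsample preserves enough signal depends on the data and should be checked against the full dataset where possible.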
A curious alternative to Hadoop for scalability is Vowpal Wabbit, which builds models on large datasets without requiring a big data system. Feature selection also comes in handy when one wants to reduce the size of training data by selecting and retaining the most predictive subset of features. Lasso, a linear model with L1 regularization, is often used for feature selection. With respect to prediction velocity and volume, scaling in volume means being able to handle more data, while scaling in velocity means being able to do it fast enough for a use case. One also has to weigh the trade-off between speed and accuracy of predictions.
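
The idea of sending the same prediction to multiple workers and returning the first answer can be sketched with Python's standard thread pool. This is a minimal sketch under stated assumptions: the workers are simulated with sleep delays, the "model" is a stand-in that averages the features, and the function names are hypothetical.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import List


def predict_on_worker(worker_delay: float, features: List[float]) -> float:
    """Simulated worker: sleeps to mimic latency, then returns a stand-in prediction."""
    time.sleep(worker_delay)
    return sum(features) / len(features)


def first_prediction(features: List[float], worker_delays: List[float]) -> float:
    """Send the same request to every worker and return whichever answer arrives first."""
    pool = ThreadPoolExecutor(max_workers=len(worker_delays))
    futures = [pool.submit(predict_on_worker, d, features) for d in worker_delays]
    done, _pending = wait(futures, return_when=FIRST_COMPLETED)
    # Do not block on the slower workers; their results are simply discarded.
    pool.shutdown(wait=False)
    return next(iter(done)).result()


# Illustrative usage with three simulated workers of differing latency.
result = first_prediction([1.0, 2.0, 3.0], [0.2, 0.01, 0.3])
print(result)
```

The trade-off is that every request consumes capacity on all workers, so this buys lower latency at the cost of throughput.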

GraphLab Turi

Turi

Commonsense Reasoning

Text Summarization with Deep Learning

GlusterFS

GlusterFS