28 December 2020

Python Dependency Management

pip 
pipenv 
pip-tools 
poetry 
micropipenv
virtualenv
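
As a minimal sketch of how these tools fit together, the snippet below creates an isolated environment with the standard-library venv module and installs pinned dependencies into it; the requirements.txt file and its contents are assumed purely for illustration (it could be produced by pip-compile or pip freeze).

    import subprocess, sys, venv

    # create an isolated environment with pip available inside it
    venv.EnvBuilder(with_pip=True).create(".venv")

    # the environment's pip lives in a different place on POSIX vs Windows
    pip = ".venv/bin/pip" if sys.platform != "win32" else ".venv\\Scripts\\pip.exe"

    # install pinned dependencies (assumed requirements.txt with exact versions)
    subprocess.run([pip, "install", "-r", "requirements.txt"], check=True)

    # print the resolved versions for auditing against the lock file
    subprocess.run([pip, "freeze"], check=True)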

19 December 2020

Relationship Conundrums

Is a woman's place in the kitchen? In many traditionalist societies, it is expected that a woman's place is in the house after marriage. This is not only odd but also rules out any aspect of equality between a man and a woman. At the start, a man expects the woman to stay the same as she was before marriage. On the other hand, the woman thinks about changing the man after marriage. This difference in thinking causes all sorts of issues. The man is unlikely to want to change after marriage. But, invariably, women do change after marriage. The person the man wanted to marry is no longer the person he is married to. However, an educated woman who decides to take on the role of a housewife after marriage, especially in Western society, will feel like she has lost everything in her life. Everything that she aspired to achieve is no longer available to her as an option. Unless, of course, she aspired to be a housewife, which is another matter altogether. A woman spending eight to ten hours caged in a house while the man is out working is bound to feel boredom and frustration. When the man comes home after having spent long hours at work, he will want to stay at home or do whatever helps him relax. The woman will want to go out. This is likely to lead to friction in the form of nagging, whinging, and most of all heated arguments. The woman will feel that the man does not respect or understand her. The man will feel the same way towards the woman. If this continues, bonding between the two never forms and they become distant over time. The woman might feel that having children will solve the issue, but that only adds to the man's financial burdens. In fact, it also leads to health issues, especially as the man is not only working long hours but faces another shift of work when he comes home and has to take the wife out. The woman may even use food as a way of releasing her frustration, leading to back-and-forth dieting, while the husband spends more time at the workplace trying to avoid confrontation. Eventually, the woman stops looking after herself and, with low self-esteem, grows less and less attractive to the man, while the man eventually avoids interacting with his wife altogether. If we take this dynamic and reverse it, a completely different story emerges. If both the man and the woman are working, they start their relationship on an equal footing. Both have plenty to talk about at the end of their day. They can both give each other space because they know what it is like to work long hours. Both of them are able to build a sense of respect and understanding towards each other and develop a bond. The woman now has a purpose other than sitting at home. And they can both find valuable time to share with each other in activities outside of work. The woman no longer feels that marriage is the end of everything. In fact, it becomes the beginning of something good and nurturing. They can both share the responsibility of doing the chores and contribute financially towards bills. The man is no longer the only breadwinner, nor the woman the only homemaker. Married life really only works when we remove the assignment of gender roles between two people who can first start out as friends (respect, understanding, support), then partners (love, trust, consideration), then husband and wife (commitment, responsibility, communication). However, in Western society, marriage no longer has any real value, nor is it necessary.
Invariably, infidelity seems to be considered an established norm. But this rarely provides for a meaningful relationship. If the initial foundations of a relationship are weak, it will hardly lead to much down the line. And loyalty usually goes amiss if there is something missing in the relationship that hasn't been resolved from the start, which brews into further issues later on. In the West, a caring and sympathetic nature towards the other is rarely considered, as people tend to be brought up in a selfish, individualistic mindset where there is a lack of patience and minor arguments can turn into major issues. And counselling only makes things worse, because bringing in a third party is not only felt as intrusive, it is likely that the mediator has relationship issues of their own. Humans are not perfect, nor should we be expected to be in society. The dynamics of society may bring their own environmental issues into the mix. It is how we deal with the ups and downs, as mature individuals, that determines the foundations we have in a relationship and whether they can withstand the test of time. In fact, as time goes on, the spark that two people had at the start of a relationship may change and turn into something completely different, which is inevitable. Perhaps it is enriched and blossoms into something even better. Or perhaps it dies away and withers in time. As they say, it takes two to tango, and it comes down to the lengths two people are willing to go to make it work. The psychology of relationships is an interesting area for modelling artificial intelligence and for deciphering the many solutions to dynamic issues. Marriage counselling through artificial intelligence could be a lucrative solution within the confines of privacy. In fact, why must people visit a human counsellor, especially as humans are born imperfect, which is not only inconvenient but also uncomfortable for many.

11 December 2020

Should You Use Flink

Flink is currently a very unstable platform. They have re-instituted FlinkML, which is unstable. They have reshuffled the graph option alongside the introduction of the Table API. Any stable work right now really depends on Spark. The Flink team really needs to make up its mind and get its head around stream processing and the abstracted features it wants to provide in the stack. In fact, the Python option is just riddled with bugs. Perhaps waiting a while might make the entire platform more stable, but that depends on the goals of the team in the near future. Even the documentation is going slightly pear-shaped. When a core aspect of a platform changes, it is best to fork it into a completely separate project. However, this fundamental shift is what has made the Flink platform so unstable and the documentation untrackable. Maybe, in the near future, something better will come along to replace Spark and Flink that is ready for commercial use. But so far it seems Spark is the only real contender in the market, albeit slightly unstable in its own right, providing a sufficient amount of flexibility without the added frustration.
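
For comparison, here is a minimal sketch of stream processing on Spark (Structured Streaming) in Python; the localhost socket source and port are assumed purely for illustration.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import explode, split

    spark = SparkSession.builder.appName("wordcount-stream").getOrCreate()

    # read a text stream from a local socket (assumed source for illustration)
    lines = (spark.readStream.format("socket")
             .option("host", "localhost").option("port", 9999).load())

    # split each line into words and maintain a running count
    words = lines.select(explode(split(lines.value, " ")).alias("word"))
    counts = words.groupBy("word").count()

    # print the running counts to the console until stopped
    query = counts.writeStream.outputMode("complete").format("console").start()
    query.awaitTermination()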

8 December 2020

Amazon Data Centers

Lucidea

Mondeca

Data Harmony

MultiTes

PoolParty

TopBraid

Synaptica

Wordmap

TemaTres

ThManager

VocBench

CoreOn

4 December 2020

BioMedical Data Sources

  • PubMed
  • PubMed Central 
  • BioMed Central
  • PubChem
  • Medline 
  • ClinicalTrials 
  • OBOFoundry 
  • Galen 
  • SNOMed 
  • UMLS 
  • ON9 
  • MeSH 
  • ICD-10 
  • GO 
  • BioPortal 
  • CARO 
  • DO 
  • FMA 
  • HPO 
  • IDO 
  • MGED 
  • MP 
  • OBI 
  • OCI 
  • OGMS 
  • PATO 
  • VO 
  • Meddra
  • RePorter
  • Toxline
  • BioPAX
  • DrugBank
  • Uniprot
  • NCBI
  • BIOGrid
  • CellMap
  • ChEBI
  • ChEMBL
  • DailyMed
  • Diseasome
  • HapMap
  • HomoloGene
  • HPRD
  • HumanCYC
  • HumanPhenotypeOntology
  • IMID
  • IntAct
  • MINT
  • NCBI Gene
  • NCBI Nature
  • PBD
  • Pfam
  • Pfam-A
  • Pfam-B
  • Reactome
  • RxNorm
  • SIDER
  • SymptomOntology
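
As a minimal sketch of programmatic access to one of these sources, the snippet below queries PubMed through the NCBI E-utilities esearch endpoint; the search term is a hypothetical example.

    import json, urllib.parse, urllib.request

    # build an esearch query against the PubMed database
    params = urllib.parse.urlencode({
        "db": "pubmed",
        "term": "knowledge graph",   # hypothetical search term
        "retmode": "json",
        "retmax": 5,
    })
    url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?" + params

    # fetch and parse the JSON response
    with urllib.request.urlopen(url) as response:
        result = json.load(response)["esearchresult"]

    # total number of hits and the first few PubMed IDs
    print(result["count"], result["idlist"])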

Why GCP Sucks

GCP is one of the most unreliable cloud solutions from a so-called reliable provider - Google. In fact, many of its services are a bad reflection of the company's lack of technical ability and operate like a third-rate product initiative. The entire platform is not only rigid but lacks a sufficient variety of service offerings to match the multitude of domain applications. In fact, it cannot be stressed enough that the entire platform runs like an experiment where the services are at best buggy and overpriced. For any architectural decision, it behooves you to consider that it can most likely be done better on AWS, and with a more flexible pricing option. In fact, there is no shortage of engineers and architects in the market sufficiently qualified on AWS to help an organization in their ramp-up time. GCP in many respects also lacks sufficient compliance and governance features. And sudden robotic alerts commonly wreak untimely havoc on your otherwise smooth-running systems: abrupt shutdowns, no advance warnings, and extremely short three-day violation notices. Online support is virtually non-existent and trying to get hold of someone is extremely difficult - the lack of a human element is not only bizarre, it rings alarm bells for critical outage situations. Most of the AI services, especially the natural language ones, are at best terribly executed and their accuracy is atrocious. Everything on GCP is a bad reflection of what Google is really like internally: arrogant, disorganized, and overly bureaucratic, forgetting about the customer's needs. In fact, even the monitoring service is terrible. Most of their databases focus on SQL; aren't we in a world where NoSQL is the norm? Google has always been terrible at understanding semantic graph concepts; literally everything they touch turns into a probabilistic problem of approximations. Where it lacks in quality and variety of services, it makes up for in devops innovations. It may be a good option where applications are still experimental and in the development stage. However, the reliability is so bad that it leaves little reason to build anything on the platform that may one day be for production use. By the time the application is ready to go live, the ongoing frustration with the platform will drive one nuts enough to want to switch to AWS.

EMVco

24 November 2020

Hybrid Methods

Combining semantic rule-based methods with machine learning yields the best hybrid methods for optimal contextualized results. A pure machine learning solution is rarely going to understand the semantics of data, and will always return some form of confidence score as an approximation with no exact results. A pure semantic method will provide exact results based on inference and reasoning constraints. A semantic solution is also logically testable through standard programmatic methods. A machine learning solution can at best be evaluated on approximations, for which formal explainability and interpretability methods are required in the process for compliance. Semantics can take the form of ontological representations like knowledge graphs and commonsense reasoning methods. When both approaches are combined, there is an increase in connected context and semantics, which is beneficial for formal artificial-intelligence-driven systems. It also provides for better transfer learning and a way of managing governance. Such methods, when combined, add significant accessibility value to business in the form of natural language interpretability, integration, feature engineering from standards compliance, and human-computer interaction interfaces. In essence, data is transformed into knowledge and information through the enrichment of machine-interpretable context that can grow through the mechanisms of self-learning and self-experience, and eventually develop self-awareness about the environment. The semantic aspects act as a simulated form of associated memories that are available for forming new data associations and relationships. Such memories are formed through aspects of persistence in a knowledge graph that could be treated as an in-memory cache for short-term processing and as storage for long-term processing. In the process, the machine learning and semantic methods feed off each other to increase learnability and comprehension of the domain targets within the aspects of an open world. The entire world wide web is based on semantic resources that make the internet accessible for browsing, searching, and findability. Metadata is present in every piece of software and hardware in use today. However, such metadata requires semantic enrichment to enable machine interpretability, which can be achieved through semantic standards. In fact, many programming languages are built using similar theoretical underpinnings of compilers, interpreters, parsers, semantics, and syntax. Many of these methods have for decades been shown to be sufficiently plausible in industrial business use cases. In many respects, these are similar aspects, albeit in simpler abstractions, to human intelligence processes that are far more intricate and complex in nature. Rarely do humans think in statistics for pattern matching and recognition; many of such processes are driven through semantic associations and are reinforced through experience. A pure statistical machine is always going to be fairly unsure about the world if based on approximations. The semantics will give it an edge to formulate meaningful associations about the world through domain-relevant experiences, which it is then able to interpret and analyze in context.
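
A minimal sketch of such a hybrid is shown below: exact semantic rules are applied first and a statistical classifier acts as the approximate fallback with a confidence score. The rule dictionary and training texts are toy placeholders; in practice the rules would be derived from an ontology or knowledge graph.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # exact semantic rules (toy stand-in for ontology-derived terms)
    RULES = {"aspirin": "drug", "ibuprofen": "drug", "asthma": "disease"}

    # toy training data for the statistical fallback
    texts = ["patient prescribed painkillers", "chronic lung condition", "follow-up scan booked"]
    labels = ["drug", "disease", "other"]
    model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000)).fit(texts, labels)

    def classify(text):
        for term, label in RULES.items():       # 1. exact, explainable rule match
            if term in text.lower():
                return label, 1.0               # rules return exact results
        proba = model.predict_proba([text])[0]  # 2. statistical approximation with confidence
        return model.classes_[proba.argmax()], float(proba.max())

    print(classify("patient started on aspirin"))
    print(classify("shortness of breath reported"))

The design point is that the rules remain testable and explainable in isolation, while the learned model only handles what the rules cannot cover.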

20 November 2020

Why do you need a Phd

A PhD, for all intents and purposes, adds little value for business. Saying that, pretentious investors do look for PhD-caliber individuals at startups in order to secure funding. In such cases, the hypocrisy can be seen in the background of such investors, who may have even worked for companies that were founded by college dropouts. Publication of research papers does not generate any revenue for an organization either, other than to gain notoriety. Considering that 80% of all published research work amounts to nothing, that is a lot of investment in wasted time and man hours. By the time a PhD candidate completes their dissertation work, it is already too outdated to be of any significant real use to industry. A PhD can last anywhere between 3 and 6 years; considering market movements and the advancement of technology in industry, apart from outdated practical skills they would have very little to offer at the time of graduation. In many cases, an entire support function needs to be developed by an organization to cater for PhD individuals, who will even need help refactoring and scaling their work. In most cases, they will need a lot of mentoring and training in all the basic skills that they should have learned in academics and that a practically applied individual with work experience would already have to offer an organization. It is questionable what quantifiable work they produce if additional resources are needed to make it of any use to business. Invariably, theory does not supersede practice. In many cases, theory may not be plausible to implement in practice due to uncertainty and complexity, of which many PhD individuals have very little experience outside of academics. And putting them in a position of influence over business product initiatives is a big risk, as they come with very little practical experience. Academics is very different from how things get done in the practical world. Who really is defined as a domain expert? Is it one that has studied a topic for decades with published papers in a sheltered environment, or one that has learned the art through delivering practical projects across industry domains? In fact, as more clueless PhD people find work in industry, this has driven a shortage in academic institutions. This is also an indication of how bad teaching is in academia and how out of touch it is with the complexities of the real world. Considering only one percent of the world's educated population holds a PhD, it is hardly a wonder they won't be missed much in industry. Perhaps people with PhDs should really stay in academics, where they can teach (or improve their lacking teaching skills) and publish papers (or improve the quality of their research) within the confines of a protected institution, and leave the practical aspects to the experienced experts in business.

13 November 2020

Why do you require a Data Engineer

Companies need to fundamentally ask the question - why do you require a data engineer on your team? The role of a data engineer spans 80% of the data science method. If a data engineer is expected to work alongside data scientists, then there is a valid assumption one can make about the incompetence of such data scientists, especially as they are only able to fulfil 20% of their job responsibilities as per the data science method. The next follow-up question naturally arises - why do you require a data engineer on your team if you already have a team of data scientists? A competent data scientist's job is to be responsible for the entire end-to-end data science method, which means not just building the models but also pipelining and scaling them so they are of use to the company. In fact, they are also responsible for doing their own feature engineering. Otherwise, this begs the question - how can one build a model without doing the feature engineering oneself? And it further begs the question that if the feature engineering work is done by someone other than the one building the model, it is highly likely to lead to an overfitted model solution that only partially solves the business case. This means there is no real need to hire a data engineer if there is already a team of data scientists. In fact, companies can significantly reduce their hiring costs by just hiring people that know and understand the data science method. Either have only a team of data engineers or only a team of data scientists, as it makes literally no sense to hire both in a team. The next follow-up question could be - if you already have data scientists on your team, then is there any point in hiring a machine learning engineer? With so many role overlaps, it only spells more and more incompetence and clueless team members, not to mention frustration in hiring so many people. Companies can save on costs significantly by hiring capable people that have the sense to understand their job functions and have the relevant practical skills. Furthermore, one can then proceed to ask the next question - why do you require a data engineer, a data scientist, and even a machine learning engineer, if you could just hire a team of AI engineers? Machine learning is part of AI, data engineering is part of the data science function, and data science is pretty much what AI engineers can also do. And, to be able to act as an AI engineer, you have to understand aspects of computer science. Which leads to the last and final question - ultimately, don't they all require a competent software engineer with practical skills that can deliver solutions for business? Invariably, what we find in industry is that the weakest link is the data scientist, and this is usually because companies are saddled with recruiting PhDs who have no practical skills apart from their academic theory, which adds little quantifiable value for business in delivering products; they neither come with software engineering skills nor do they have the applied-experience mindset to appreciate it. This is precisely why so many data scientists in industry are only interested in building models: in academics, they are rarely taught the importance of feature engineering, how to build a scalable pipeline, or the entire data science method. Essentially, the data engineer role becomes a gap filler for all the inadequacies and deficiencies of an incompetent data scientist resource.

2 November 2020

Are Men and Women Equal

Mathematically, two things are equal when they are exactly the same and when they represent the same object. Equality is symmetric, transitive, and reflexive. For men and women to be equal, they would have to be identically the same in attributes. Biologically, men and women are distinctly different. Men cannot give birth. Men cannot get periods. Although physically men and women may look similar to some degree, mentally and hormonally they may also differ. Essentially, they have different attributes that define their aggregate makeup. Generally, the laws of most countries, on paper, only recognize two genders, male and female, so we can reduce the complexity here from infinite genders. Since a male has XY chromosomes and a female has XX chromosomes, for female to equal male, X would have to equal Y. The laws of equality dictate that if X = Y, then Y = X. That would invalidate the existence of two sexes. For a woman to exist or continue to exist, one would need XY for reproduction and variety, assuming the first woman born cannot be born pregnant. In order for the population to grow, there is a need for the XY chromosome. Although, this is quite an oversimplification of the various mutations that can occur. Let's assume X = 1 and Y = 2. In which case, through addition, X+X = 2 and X+Y = 3, and 2 is not equal to 3. Let's assume the second scenario of X = Y, Y = X. In which case X+X = 2 and X+Y (where Y can be replaced by X) = 2, but this is not possible because that would mean both are women under the laws of equality, which would refute the claim of having two sexes, man and woman. The alternative also does not hold when X is substituted for Y, in which case both cannot be men. But since men and women do exist, equality cannot hold. This is precisely why a woman is called a woman, and a man is called a man: because the two are dissimilar. In fact, one can substitute addition for multiplication and arrive at the same answer. Which would imply that Y cannot equal X, therefore XX cannot equal XY. Invariably, in society we still find women expecting men to pay for them and even to the point of bending the knee for a proposal. Rarely do we see a woman bending the knee to propose to a man. Women often take pride in the fact that they can multitask, while at the same time making the implication that men cannot. By making such sweeping statements, perhaps they don't realize they put themselves in a self-defeating corner with regard to any justifiable evidence for equality. Also, stating that biological males should not be competing against biological females automatically dismisses any argument in support of equality. Separate public bathrooms for men and women also disprove the notion of equality. Laws in most societies are there to protect women in terms of abortions, custody battles, alimony, and child support. When it comes to abortions, women talk about "my body, my choice" and ignore the rights of the father. However, when the child is born, suddenly the rights of the father become apparent to the woman when she demands child support. Such expectations clearly do not define equality. What they do display is hypocritical double standards. Hence, they may have different needs but similar wants, which is what society defines as equality. An obvious case can be found in their differing needs in a typical supermarket, where the women's personal care aisle is separate from the men's personal care aisle. If they were equal, they would share essentially the same needs in personal care products.
That would imply that, emotions aside, logically men and women are two distinct types of humans that cannot be equal to each other. In fact, one can go further and make the assumption, under the laws of equality, for the cases of both men and women, that they are essentially defined as unique individuals with uniquely defined attributes that measure the summation of their identities. No two men are ever alike. Just like no two women are ever alike. We are all unique in our own way. In fact, even identical twins eventually develop variants. In multi-faceted societies, gender roles are often affected by environmental factors that go beyond the economic, social, and cultural divides that influence the makeup of individual personalities and identities.

1 November 2020

Linear Annotations

In theory, annotations can be implemented in multiple different ways. However, in practice they almost always tend to be linear. Language is hierarchical and therefore not very linear. Programmatically applying non-linear steps would mean multiple dependency points, lots of side effects, an increase in complexity, and a requirement for state. In fact, it also means the annotations become dynamic rather than static. In big data terms, this is not only a huge computational cost; it also means many raw sources will fail the annotation process and bring the entire pipeline to a sudden grinding halt as the data sources and annotations grow. In fact, with just two to five annotations and thousands of data sources, the computational requirements can be exhaustive and time consuming. In practice, annotations can vary anywhere from one annotation to thousands of annotations. In media and publishing, annotations can be so numerous that they are given their own numbering system against a set standard. There are fundamentally a few abstractions in a callable annotation process, which may include: model, score, label, annotation, and metrics. Invariably, there is also a frontend component that talks to the backend model components, as well as an evaluation and an adjudication process in order to validate and mediate a corpus production. A separate method may be applied to quality-check the human annotations, for example via active learning and predetermined evaluation metrics. The model can be linear or non-linear. However, the annotation process itself in a pipeline is almost always linear in nature. An example of a basic set of linearly defined and aggregated parsing steps might be an entity recognizer. One can utilize annotation tree structures on the frontend, but this creates complexity issues in the backend model process from an input/output perspective. In fact, some model steps may follow a serially defined dependency where one model process leads into another model process step. In the cloud, data tends to follow an immutability constraint both for processing as well as storage. In industry, as a result of the huge cost of exponential complexity, not to mention an increase in errors when processing large numbers of data sources, non-linear structured annotations have not received widespread adoption outside of the scientific research community. It also seems worth highlighting that people who suggest to businesses that they use non-linear annotations as an alternative are likely too inexperienced to understand the practical complexities that come with such architectural issues in production mode, unless they can provide at least one successfully deployed productionized pipeline, in industry, that used non-linear annotations along with the associated metrics to back up their claims. In fairness, it may just be confusion: something they regard as non-linear may just be an aggregation of linear annotation steps.
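
A minimal sketch of a linearly composed annotation pipeline is shown below; the annotator steps are hypothetical stand-ins for real model components (tokenizer, entity recognizer, scorer).

    # each step takes the document dict from the previous step and enriches it
    def tokenize(doc):
        doc["tokens"] = doc["text"].split()
        return doc

    def recognize_entities(doc):
        # toy "entity recognizer": treat capitalized tokens as entities
        doc["entities"] = [t for t in doc["tokens"] if t.istitle()]
        return doc

    def score(doc):
        doc["score"] = len(doc["entities"]) / max(len(doc["tokens"]), 1)
        return doc

    # strictly linear: no branching, no shared state across data sources
    PIPELINE = [tokenize, recognize_entities, score]

    def annotate(text):
        doc = {"text": text}
        for step in PIPELINE:
            doc = step(doc)
        return doc

    print(annotate("Flink and Spark process streams in Berlin"))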

24 October 2020

Ethicists Are Unethical

Ethicists are not very ethical individuals. In fact, they are in a profession of knowing what is right and wrong. Perhaps, it is this confidence at knowing what is right and wrong that makes them less likely to act ethically in real life. In many respects, an ethicist is the most likely hypocrite because even after knowing something is wrong they are willing to commit the act. They are often seen preaching for the right things, but hardly applying any of it in practice in their own life. Furthermore, AI ethics is doomed if a human ethicist is relied on to develop the guiding principles as they are naturally unqualified. It is this feeling of entitlement that makes one consider doing something bad. They are also most likely to suffer from the god complex. Ethical and moral judgement is clouded in human nature. Do we really need to implement and replicate this flawed human nature in AI? In many cases, regular introspection and self-reflection are fundamentally important aspects of ethics and morality which may need to be extended into the generalizable AI machine. 

9 October 2020

Is There Really A Skills Shortage

In most cases, in industry, a skills shortage invariably does not exist. There is always more supply of skills than there is demand for them. Hence why, in many countries, there is always a percentage of unemployment. People are willing and able to work. However, the problem stems from the fact that employers filter out perfectly good CVs. This may be a result of their own biases, their sense of likability, keyword hunting, their desire for more skills at substantially less pay, or the fact that they don't care to read the full candidate profile. In many organizations, the first people involved in filtering CVs are non-technical individuals who have no understanding of the skills nor the context of how they are used. By the time the hiring manager receives the profiles, they have already been whittled down through recruitment bias. The interview selection process applies even further bias, and in turn the person they decide to recruit may not necessarily even be the best candidate in the pile of CV applications. For many roles, a recruiter may receive anywhere from one to hundreds of CVs, of which many are likely to be suitable candidates. The recruitment process is not very fair for candidates, as it is a one-sided process that favors employers. There is relatively little respect or consideration for candidates during the application, or through a conscious feedback process from the employer. In many cases, GDPR processes at organizations may not even be fully compliant nor provide transparency with regard to how candidates' personal details have been processed and stored. Recruiters invariably pass on CVs to managers, who may then pass on CVs to other members of the team, all the while such personal details are being stored on multiple email accounts and may even get printed out in hardcopy. One may even notice the reckless use of candidate CVs as scratch paper. In some cases, consent to pass on details may be provided to the recruiter, but for whatever reason the recruiter may decide not to represent the candidate; during this time the candidate may not have any feedback on where and how their personal details have been processed, stored, or passed on. More needs to be done to protect the rights of individuals and their personal details, both when they are applying as candidates and when they transition into employees. Managers and recruiters seem to forget that the very candidate they mistreat, disrespect, or are inconsiderate towards during an application process could one day become the founder of a company that they may want to work with in the future. An individual deserves just as much respect when they are a candidate as when they are an employee or an employer. Can AI really be a solution towards solving many of the above issues created by humans? Perhaps, but only if managerial politics and biases in organizations can be removed from the equation.

20 July 2020

Rust Is Not Yet A Better Language

  • Questionable and at times dodgy Rust arithmetic
  • Function calls touch memory twice in Rust
  • Rust is not faster than C++
  • Unproven safety mechanism in Rust
  • Painful rewrite of C library headers
  • Compilation times are slow
  • Rust is a pain, lacks transparency, and is inconsistent to work with; some compiler behavior is documented rather than properly checked
  • Integration with other languages is difficult
  • Rust has a bigger assembly code footprint than C++
  • Unsafe blocks are not checked
  • Most enterprise and technology products are built using C/C++/Java interfaces, where a complete rewrite might be required for Rust
  • Rust doesn't play nice with other languages
  • Rust ecosystem tools are insufficient for prime time use
  • Tedious and verbose
  • No formal community specifications and release process

18 July 2020

L4-L7 Network Services Definition

L4 - Transport e.g. TCP/UDP
L5 - Session e.g. connections
L6 - Presentation e.g. SSL/TLS
L7 - Application e.g. HTTP/SIP


Set of functions for: Firewall, Load Balancing, Service Discovery, and Monitoring.
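
A minimal sketch mapping these layers onto Python's standard library is shown below; the target host example.com is assumed purely for illustration.

    import socket, ssl

    context = ssl.create_default_context()                        # L6: presentation (TLS)

    with socket.create_connection(("example.com", 443)) as tcp:   # L4: transport (TCP)
        # L5/L6: establish a secure session over the TCP connection
        with context.wrap_socket(tcp, server_hostname="example.com") as tls:
            # L7: application-level HTTP request over the secured session
            tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
            print(tls.recv(200))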

15 July 2020

Optimization

When should one optimize their implementation? Definitely not during the prototype stage! Also, definitely don't spend time optimizing an incorrect solution that doesn't solve the business case and wastes valuable man hours. Most software development practices follow three core principles: implement, test, refactor. At the implementation stage, one is trying to codify the algorithmic logic. The testing stage tests the implementation logic. Once the implementation has been tested, it can be refactored. In fact, in agile practice such processes may even be cyclical against granular pieces of functional code that work against a scoped feature. Generally, good coding practice dictates loose coupling and high cohesion, while treating algorithmic implementations as functional black boxes against their parameters. It is only after the refactoring stage that one can start to look at aspects of performance, reliability, and scalability. Even at this stage, one has to first identify the bottleneck, as premature optimization is bad. Definitely, profiling helps (a small profiling sketch follows the checklist below). An even better approach is to continuously build/deploy the implementation artefact for isolated testing, where one can throw data at it in a simulated mode. In fact, when deploying an implementation to the cloud, one has to view constraints both at the system level as well as the application level. And, in many cases, optimization constraints may be negligible enough at the application level that they can be offloaded to the system level.

  • Don't bother optimizing a solution in prototype mode, focus on solving the problem (what if the solution is incorrect, one might be wasting time optimizing an incorrect solution)
  • Focus on testing the implementation
  • Once the implementation is correct, refactor it - focus on high cohesion, loose coupling
  • Keep implementation loosely coupled from third-party libraries (also, don't get hung up on such things as whether the third-party library is using a c implementation for optimization, it is more important at this stage to make sure the algorithmic implementation is correct)
  • Treat third-party libraries as dependencies
  • Use profilers and correct metrics to check for performance
  • If there is a bottleneck identify where it is at application-level or system-level
  • Only optimize as an afterthought and when a bottleneck is identified (How can one optimize for something without knowing where the bottleneck is? In some cases, through experience, one might even know earlier in the implementation where a bottleneck might occur, in which case, eager optimization can be compensated from experience and may in fact be beneficial to save time later in the process)
  • Don't optimize for the sake of optimizing, it may be unnecessary, especially in the cloud
  • When a dependency is the bottleneck, replace it with another, more performant dependency or create own
  • Swapping dependencies should not affect the algorithmic implementation (as long as the algorithmic implementation to use is also correct in the third-party dependency), which is the whole point of using functions as blackboxes that provide parameter passing. Create a wrapper if needed. Use appropriate best practices and patterns.
  • Is the bottleneck at application-level negligible enough to be offloaded to system-level cloud infrastructure?
  • Don't eagerly optimize early and often - premature optimization only leads to more issues and complexity
  • Only optimize when there is a logical need to do so (be pragmatic)
  • With increasing level of experience, one can deduce when, where, and how to optimize for the outcome of results
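
Here is the minimal profiling sketch with the standard-library profiler; the slow function is a deliberate toy hot spot for illustration.

    import cProfile, pstats

    def slow_sum(n):
        total = 0
        for i in range(n):
            total += sum(range(i % 100))   # the hot spot the profiler should surface
        return total

    profiler = cProfile.Profile()
    profiler.enable()
    slow_sum(50_000)
    profiler.disable()

    # inspect the top offenders before touching any code
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)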

"The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming. " - Donald Knuth

"Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. " - Donald Knuth

29 June 2020

Useless Managers

In many organizations, managers are clueless. They don't appear to be managing anything, let alone their own work. One will often find them in meetings, running around doing their personal chores outside of work, talking on the phone about things unrelated to work, and being unapproachable to their employees. They often act as a series of buffer placeholders in a role function, generating red tape in the interaction between employees. When the organization performs poorly, they will look to make a selection of employees redundant while maintaining their own jobs, even earning a bonus or a promotion for such an action. Invariably, most of their actions are counterproductive for teams, as they almost always have a selfish personal game plan and are often seen instigating politics. They will interact with employees with an air of authority while having little to no practical skill in effective management. Even their personal performance reviews are purposefully planned to garner as much end-of-year bonus as possible, away from employees. Currently, in organizations, there is little to no accountability for managers. In some organizations, there are more managers allocated to teams than actual employees doing the work, making it highly inefficient and not very cost-effective. Considering the fact that an organization is non-functional without its employees, perhaps the role of the manager needs to change or be removed entirely in favor of a flatter structure of working practices. In most cases, a manager's role can be replaced by artificial intelligence. On the other hand, perhaps the manager needs to be treated as a subordinate to employees so as to enforce management practices for teams that are more nurturing and approachable. And, rather than making hundreds and thousands of employees redundant over failure to meet business strategic goals, the responsibility should rest on the shoulders of the managers, who are then subordinate and accountable to those employees and the primary ownership team. In fact, such approaches might increase productivity, reduce turnover, raise job satisfaction, drive more creativity and more effective ownership of work, and provide options towards a better-performing organization in its success factors. Managers in many organizations lack emotional intelligence, use politics to build self-centered relationships, lack the necessary self-management skills, lack basic self-awareness skills, lack basic common sense, are usually neither a company fit nor a team fit, generally come with very poor technical skills, lack the necessary experience, and lack the necessary maturity to drive cohesive management in organizations.

Loss of Creativity in Academics

People in the field of academics are some of the most uncreative individuals. And, in general, people that are toppers at university and school end up being really bad at innovation. The more educated one gets, the less one is willing to take risks and think outside the box. Hence why most successful companies have been established by dropouts. In fact, someone with a Math degree is more than likely to need mentoring in how to think in the workplace, as they have very minimal training in applied skills. In life, problems are not pre-defined. We need to discover issues, formulate our own problems, and find solutions, all of which is derived from experiences which can't be found by just reading books and passing exams in a classroom. In many cases, the solutions are not so typical that they can be derived from a crib sheet. Each problem in life tends to have its own unique solution. Uncertainty is also a major contributing factor of life that adds to the variability. Math teaches one how to deal with definite terms where someone has potentially already defined the problem and a theoretical formula. In life, rarely is anything a definite surety. Math in many cases is flawed because everything is a theory, which can be proved or disproved by anyone at any time. In many cases, this is seen as a typical issue in the Banking, Scientific, and Economic disciplines, because people invariably do not account for uncertainty and creativity when building systems, basing all their structures on theoretical understanding, which not only makes them less reasonable but also quantifiably inaccurate. Creativity is what drives innovation. Academic disciplines in many cases try to inhibit such creativity. In many respects, an artist or a musician is a perfect example of creative expression, which seems to be lacking metaphorically in most theoretical disciplines that only want to work against set formulas. Not everything in life has an explainable, pre-defined mathematical formula. In fact, current theories do not provide a fully comprehensive understanding of the world we live in, nor do they enable us to answer every notable question of nature. The complexity of the world we live in is understated in theory only so that we as humans can understand it in simple terms. However, such simple terms of explainability lack the detail that we require to build replicable systems that can accurately reflect and take inspiration from nature.

20 June 2020

Tensorflow Tools

  • ML Metadata
  • Data Visualization
  • Serving
  • Tensorflow.js
  • Transform
  • Model Analysis (TFX Model Pusher and TFX Model Validator)
  • Lite
  • Privacy
  • Federated
  • CoLab
  • Probability
  • Graphics
  • Agents
  • Ranking
  • Quantum
  • Magenta
  • TensorRT
  • Tensor2Tensor
  • Tensorboard
  • Extended
  • Sonnet
  • Dopamine
  • Lattice
  • Model Optimization
  • Hub
  • RaggedTensors
  • Mesh
  • I/O
  • TRFL
  • Unicode Ops
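
A minimal sketch touching a few of the tools above (Keras, TensorBoard logging, and a Lite export) is shown below, assuming TensorFlow 2.x; the data and model are toy placeholders.

    import numpy as np
    import tensorflow as tf

    # toy data and a tiny binary classifier
    x = np.random.rand(256, 4).astype("float32")
    y = np.random.randint(0, 2, 256)

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

    # log training to TensorBoard
    model.fit(x, y, epochs=2, verbose=0,
              callbacks=[tf.keras.callbacks.TensorBoard(log_dir="./logs")])

    # export to TensorFlow Lite
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    with open("model.tflite", "wb") as f:
        f.write(converter.convert())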

Incompetent Graduates

Graduates on the whole can be a nightmare to work with, especially if they have come straight out of university with zero practical work experience. It is when they start talking out of their depth and undermining the experienced people on the team, despite their lack of experience, that it gets really annoying. Another aspect of questionable hiring is when graduates need mentors in the workplace to teach them everything that they should have been taught at university. Another annoyance is when they insist on using anti-patterns while questioning the use of best practices and the correct use of patterns. In fact, they have probably never even heard of the best practices and patterns before. Nor do they have the necessary skills to think through practical solutions for business. When things go wrong, they are quick to blame others rather than take ownership of their mistakes and learn from the experienced people on the team. As a result, they will drag the experienced people on the team into making the same mistakes. Furthermore, one will have to deal with their questionable understanding of even the most basic things. In other cases, it is their bizarre comments like "there is no such thing as reverse discrimination", "statistics is not math", or "decision trees is not machine learning" that make them sound utterly clueless. How can you have statistics without the elements of math? Even more annoying is when they assume they know more than the experienced member on the team and start teaching them the basics, making themselves sound like a beginner with little to no practical skills. Some of them even need help with a google search. They are basically so used to courseworks, where someone literally hands everything to them on a silver platter, that they assume working on a business product case would work the same way. In fact, they have a tendency to approach business solutions like they are working on a school assignment. On the whole, most degrees should enable a graduate to think and to be resourceful in learning on their own (towards a holistic attitude of self-learning and self-exploration, with the added sense of creativity to extend it in some way). However, it seems universities are not teaching students how to deal with complexities and uncertainties. When they come into the real world, they are ill-equipped with the practical skills or the mindset to think on their own, apart from turning up with a shed load of arrogance and a total lack of creativity. It is as if they think passing a degree means they have conquered the world. The worst thing a team can do during a project is to trust graduates, especially if they have no practical experience. In fact, they will find themselves quite insecure working around experienced individuals, to the point of being disruptive and undermining their work. Employers need to stop hiring graduates that have no work experience and give preference to individuals that have at least some applied skills to share as examples in practical application cases. Education is rarely ever a substitute for experience in any practical field. Just passing an exam will not give a graduate the necessary skills to do the job.
All university courses should really be providing at least a year of practical training in a controlled environment, either via internships, bootcamps, gap-year programs, or an element of project work as an application accelerator of applied theory in their courses, outside of assignments and exams, that replicates the practicalities of solving problems in the real world. Then, when graduates take on a full-time position, they can be productive and have the correct mindset for business application. Many universities are also failing to teach graduates the basics of morality and ethics as part of their course, in order to maintain their professionalism when they embark on their careers, where they will be interacting with all sorts of people and dealing with workplace issues. If employers have hired graduates correctly, there should never be a need to provide them with formal mentors in the workplace, and excessive expectations of handholding should not be the norm. On the other hand, employers that listen to inexperienced graduates and allow them to cluelessly dictate project work often lose out considerably, with missed deliverable targets, badly executed products, and extended time spent training graduates in even the most mundane things, which becomes a source of frustration for teams. Not to mention the contradiction: on one hand they require mentoring for their lack of experience, and on the other they try to dictate project work in teams, which is bound to lead to catastrophic failure.

10 June 2020

Applications of Metaphor Processing

  • Creative Writing
  • Joke Generators
  • Figurative Information Retrieval
  • Narrative Generators
  • Sentiment Recognition
  • Persuasive Marketing
  • Commonsense Reasoning
  • Political Communication
  • Discourse Analysis
  • Reading Comprehension
  • Review Generators
  • Poetry Generators
  • Lyrics Generators
  • Slogan Generators

20 May 2020

GNN Datasets and Implementations

Citation Networks:

  • PubMed
  • Cora
  • Citeseer
  • DBLP

Biochemical:

  • MUTAG
  • NCI-1
  • PPI
  • D&D
  • PROTEIN

Social Networks:

  • Reddit
  • BlogCatalog

Knowledge Graphs:

  • FB13
  • FB15K
  • FB15K237
  • WN11
  • WN18
  • WN18RR

Repos:


Implementations:


GNN Models:

  • GGNN
  • Neural FPs
  • ChebNet
  • DNGR
  • SDNE
  • GAE
  • DRNE
  • Structured RNN
  • DCNN
  • GCN
  • CayleyNet
  • GraphSage
  • GAT
  • CLN
  • ECC
  • MPNN
  • MoNet
  • JK-Net
  • SSE
  • LGCN
  • FastGCN
  • DiffPool
  • GraphRNN
  • MolGAN
  • NetGAN
  • DCRNN
  • ST-GCN
  • RGCN
  • AS-GCN
  • DGCN
  • GaAN
  • DGI
  • GraphWaveNet
  • HAN

Deep Fact Checking

In general, fact verification attempts to obtain supporting evidence from text in order to verify claims. The labels used to classify a claim can be "supported", "refuted", or "not enough info". In many respects, this is a natural language interpretation process of entailments. Some methods in this process may incorporate evidence concatenation or individual evidence-claim pairs. Unfortunately, such methods are limited in sufficiently identifying relational and logical attributes of information from the evidence. In order to integrate and reason over several pieces of evidence, one has to utilize a graph network for aggregation and reasoning, enabling a connected evidence graph with a means of identifying information propagation. A deep workflow process using deep learning with graphs is one approach. The first step in the process is to use a sentence encoder with BERT. The second step is to combine evidence reasoning with aggregation in a modified graph attention network. DAGs can further be utilized for relation and event extraction representations and linkage.
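
A minimal sketch of the aggregation step is shown below: attention over encoded evidence sentences with respect to an encoded claim. Random vectors stand in for BERT sentence embeddings, and the classifier head is a toy placeholder.

    import numpy as np

    claim = np.random.rand(8)           # encoded claim (stand-in for a BERT vector)
    evidence = np.random.rand(5, 8)     # five encoded evidence sentences

    scores = evidence @ claim                          # relevance of each evidence to the claim
    weights = np.exp(scores) / np.exp(scores).sum()    # softmax attention weights
    aggregated = weights @ evidence                    # weighted evidence representation

    # toy classifier head over the concatenated claim and aggregated evidence
    logits = np.random.rand(3, 16) @ np.concatenate([claim, aggregated])
    label = ["supported", "refuted", "not enough info"][int(np.argmax(logits))]
    print(weights.round(2), label)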

12 May 2020

Text Production Datasets

Data-to-Text Generation
  • WikiBio
  • WikiNLG
  • SBNation
  • RotoWire
MR-to-Text Generation (Meaning Representation)
  • SR'18
  • E2E
Text-to-Text Generation
  • Summarization (DUC2001-2005)
    • CNN
    • DailyMail
    • NYTimes
    • NewsRoom
    • XSum
  • Simplification
    • PWKP
    • WikiLarge
    • Newsela
  • Compression
    • Gigaword
    • Automatic Creation of Extractive Sentence/Compression
    • MASC
    • Multi-Reference Corpus for Abstractive Compression
    • Cohn and Lapata's Corpus
  • Paraphrasing
    • MSRP
    • PIT-2015
    • Twitter News URL Corpus
    • ParaNMT-80
    • ParaNMT-50
    • MTC
    • PPDB

4 May 2020

Social Mixed Reality

In times of social distancing, people still want to be able to connect. Bars are empty. Parks are relatively empty. It seems like the linear social networking sites have become a bit of a trite concept. The next phase of socializing on the web is likely to be in the form of mixed reality - combining some sort of augmented reality with virtual reality concepts. This concept of virtual connection will enable people to meet in all sorts of different ways and maintain a safe distance if they need to. It will also help people that are medically ill in hospital, or busy at work, to connect without physically leaving their location. It will also enable people to connect across the globe, which means a reduced need for frequent travelling. Communication is likely to take on new forms of media as people live more complicated lives with socially unique circumstances. In fact, it would enable people with children to socially connect as well, within safe environment controls. Such virtual environments may also extend from remote support work to customer service and sales/marketing interactions. Similarly, it can also be extended towards socially connecting collective religious prayer, such as for churches, synagogues, mosques, and temples, allowing people to make their regular rituals of worship remotely without having to physically visit such places. In many respects, non-Muslims may want to visit Mecca, where mixed reality could allow them to have a near real-time experience. Others may want to visit Jerusalem from the convenience of their home. Or an individual might want to join a global network of virtually connected prayer gatherings, or even a social party scene like a wedding, concert, or festival.

17 April 2020

When is a University Degree Pointless?

  • When you need mentors in workplace to teach you how to do everything?
  • When you need help using google to search for information?
  • When your only way of learning anything new is by asking or expecting others to show you how it is done?
  • When you can't seem to understand anything and often in workplace you use phrases like 'I don't understand'?
  • When you have no applied skills even if it is intuitive or requires basic common sense?
  • When you can learn more by doing rather than by sitting in a classroom?
  • When your entire objective of learning is to pass an exam and/or course?
  • When you spend your time sharing academic theory but have literally no practical awareness of how to apply any of it?
  • When you can understand everything by just picking up a book or an online tutorial then applying it yourself?
  • When you understand the advanced theoretical concepts but have no clue about the basic mechanics of it?
  • When your lack of academic integrity extends into your bad work ethics and behavior?
  • When you share an air of arrogance about having achieved a degree by looking down on people with many years of practical experience in workplace?
  • When you have little respect and consideration for others in workplace?
  • When you are unable to meet and agree on sensible timelines?
  • When you struggle to micro-manage and organize your own work habits?
  • When you sit there in workplace playing politics and blame games with everyone?
  • When you are not proactive and sufficiently resourceful in the workplace?
  • When you struggle to make decisions and reason through things especially when things go wrong?
  • When you need help with everything including things that are a no-brainer?
  • When you outright dismiss new ways of doing things, as part of your narrow-mindedness, without keeping an open-mind for critical exploration and assessment?
  • When you don't give credit where credit is due?
  • When you exhibit discrimination, racism, and biases in workplace in the way you treat others?
  • When you expect the opposite gender to clean up after you and treat them as less equal to you?
  • When the manifestations of your mannerisms and attitudes display a lack of professionalism?
  • When you feel the need to argue about everything even things that don't require a discussion?
  • When you find it difficult to adapt to change?
  • When you prefer to work in a routine?
  • When you don't learn from your mistakes?
  • When your degree is unrecognized and unaccredited as a valid degree?
  • When you find yourself questioning the value of your degree and the time spent towards achieving it?
  • When you have sufficient practical experience that no one bothers to even care what university you went to and what degree you earned?
  • When after earning a degree you are still doing a menial job that didn't even require a qualification?
  • When majority of your learning happens in the school of life rather than in a classroom?
  • When you are still clueless about what you want to do with life and how getting a degree will make that happen?
  • When you look back and you realize you should have done a degree in another subject instead?
  • When you are still a reckless and unproductive mess in society?
  • When you realize your intention of attaining a degree was to please your parents rather than being aligned with your interests and talents?
  • When you use wikipedia references as a source of your knowledge?
  • When you rely on conspiracy theories, gossips and rumors as a way to justify your claims?
  • When you only know how to regurgitate things?
  • When you can't see any sense in applying best practices?
  • When you discredit others based on their backgrounds rather than objectively evaluating the quality of their work?
  • When you try to take credit for other people's work?
  • When the name of an institution you attended is more important and significant in value to you in comparison to the content of the course?
  • When you disrupt and undermine other people's work while requiring help to do your own work?
  • When you unethically use, exploit, and walk over other people to help advance your own career?
  • When you had to bribe your way into earning a university degree?
  • When you had to pay someone on a crowdsourcing site to do the entire coursework or dissertation for you?
  • When you cheat your way through an exam or a coursework?
  • When your degree course has literally no coverage on applied ethics in broader and narrower terms whether as part of university-wide or course-specific initiative?
  • When your investor only cares about what university you went to rather than the potential of your proposed product?
  • When your employer or interviewer cares more about what university you went to than about your skills and experience?
  • When you spend time optimizing a solution even before you have fully understood the business case and solved the problem in a prototype?
  • When you are unwilling to challenge the norm and the status quo, especially when it is incorrect or not the better way of doing things?
  • When you can't be bothered to read the documentation?

English Wordnet

25 March 2020

Fake Data Scientists

How to spot the fake data scientist?
  • they have no clue about the processes of a data science method
  • they skip the feature engineering part of the data science method
  • they require data engineers to provide them cleaned data through an ETL process
  • they need a whole team of technical people to support their work
  • they are only interested in building models, and the models they build are almost always overfitted because they never bother to do the feature engineering work themselves
  • they don't consider creating their own corpus an important step of model-building work
  • they don't understand the value of features when training a model to solve a business case
  • they have no clue how to build, scale, deploy, and evaluate their models in production
  • they think that with a PhD they know everything, but in practice they can do next to nothing
  • they rarely bother to understand the business case or ask the right questions
  • they don't know how to augment the data to create their own corpus for training
  • they don't know how to apply feature selection
  • they don't know how to generalize a model, so they sit there re-tuning their overfitted models
  • they spend years and years sitting in organizations building overfitted models when they could have built generalizable models in weeks or months
  • they don't understand the value of metadata or the value of knowledge graphs for feature engineering
  • they raise ridiculously dumb issues during agile standups, such as having built a model that lacks certain features (i.e. they skipped the feature engineering step)
  • they build a model straight out of a research paper and assume the exploratory step is the entire data science method
  • they use classification approaches when they should be using clustering methods
  • they are unwilling to learn new ways of doing things and unwilling to adapt to change
  • they prefer to use notebooks rather than build a full structured implementation of their models that can be deployed to production
  • they build models that contain no formal evaluation or testing metrics
  • they only partially solve a business case because they skipped the feature engineering or passed that effort to a data engineer
  • they are only interested in quantitative methods and unwilling to think outside the box of what they have been taught in academia
  • they build academic models that are not fit for production and add no business value
  • they require a lot of handholding and mentoring to be taught basic coding skills
  • they struggle to understand research papers, or to accept that 80% of such research work is useless and of no inherent value
  • they literally assume that something is state of the art when it is mentioned in a research paper, rather than assessing the model's appropriateness for solving a business case
  • they don't bother to visualize the data as part of the exploration stage
  • they don't bother to do background research to identify use cases where a certain approach has worked or not worked for a business
  • they don't bother to look appropriately at reusing existing work
  • they have no understanding of how to clean data
  • they try every model type until something sticks
  • they don't have clarity on how the different model types work
  • they don't fully understand the appropriate context of when to apply a model type
  • they only know a few model types and how to apply them to a narrow set of business cases
  • they don't understand bias and variance
  • they don't know whether they want accuracy or interpretability, nor how to pick between them
  • they don't know what a baseline is
  • they use the wrong set of metrics
  • they incorrectly apply the train, validation, test split (see the sketch after this list)
  • they go to the other extreme of focusing on optimization before actually solving the problem
  • they have a PhD and the arrogance to match, but literally no practical experience of how to productively apply any of it in the workplace, especially against noisy unstructured data
  • they come with fancy PhDs and spend time teaching others how to do their job, but usually require the help of everyone on the team to do their own work
  • they come with a PhD in a specific area but have no willingness to understand other scientific disciplines in the application of data, or have a tendency to outright dismiss such methods
  • they think AI is just machine learning
  • they want someone to hand them a clean dataset on a silver platter because they can't be bothered to do it themselves, nor do they think it is an important aspect of their work
  • they can't seem to think beyond statistics to solve a problem
  • they have a tendency to look down on people and dismiss anyone who doesn't hold a PhD
  • they struggle to understand basic concepts in computer science
  • they need a separate resource to help them refactor their code and won't be bothered to do it themselves
  • they find that services like DataRobot help their work by automating machine learning, especially feature engineering, which inherently allows them to build overfitted models much faster
  • they can't tell the difference between structured and unstructured data
  • they don't have a clue how to deal with noisy data
  • they are not very resourceful in hunting for datasets as part of a curation step
  • they need to be shown how to google for things, and basically need someone constantly showing them how to do things to be practical in the workplace
  • they prefer GUI tools that let them build models with buttons and drag-and-drop rather than building them by hand
  • they state that they have been a data scientist for the last 20 years when the field has only been mainstream in industry for the last 4 or 5 years (one indication of when the designated role emerged is when it first started appearing on recruitment boards and within organizations)
  • they want to apply machine learning to everything, even where it may be overkill
  • they hold a PhD but are more than happy to plagiarize other people's work and try to take credit for it; in many cases their contribution is probably just exposing it as an API
  • they hold a PhD but try to take credit for the entire work, even when someone else or an entire team has probably done 80% of it
  • they use personal pronouns like 'I' in most cases, but rarely use 'we' when working in a team
  • they only care about their inputs, outputs, and dependencies for building a model rather than being flexible, considerate, and thinking as a team in looking at the bigger picture
  • if your 'head of data science' uses phrases like 'I don't understand' to the point of annoyance, it is a likely indication of their technical incompetence in that capacity
  • they think decision trees are just a bunch of rules and not a type of machine learning technique
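
As a contrast to the habits above, here is a minimal sketch of a baseline-first workflow with a proper train/validation/test split and explicit evaluation metrics. It is written in Python with scikit-learn against a synthetic dataset, so the dataset, model choice, and scores are purely illustrative assumptions rather than a prescribed implementation.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Synthetic stand-in for a curated, feature-engineered dataset.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8, random_state=42)

# Hold out the test set first, then carve a validation set out of the remainder.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42, stratify=y_train)

# Baseline: always predict the majority class. Any real model must beat this.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
print("baseline F1 (val):", f1_score(y_val, baseline.predict(X_val)))

# Candidate model, compared against the validation set only while iterating.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("model F1 (train):", f1_score(y_train, model.predict(X_train)))
print("model F1 (val):  ", f1_score(y_val, model.predict(X_val)))

# A large gap between the train and validation scores signals overfitting.
# The held-out test set is touched exactly once, at the very end.
print("model F1 (test): ", f1_score(y_test, model.predict(X_test)))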

9 March 2020

Individualism & Economic Order

Informal To Formal Ontology Terminology

  • building a model of some subject matter -> building an ontology (a minimal sketch of these mappings follows the list)
  • things -> individuals
  • kinds of things -> classes
  • generalizations/specializations -> subClassOf
  • some kind of thing -> instance or member of class
  • literal is of a certain kind -> has a datatype
  • relationships between things -> object properties
  • attributes of things -> data properties
  • kinds of literals -> datatypes
  • saying something -> asserting a triple
  • drawing conclusions -> inference
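
A minimal sketch of how these mappings look in practice, using the Python rdflib library; the zoo namespace, class names, and property names below are made-up illustrations, not part of any particular ontology.

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS, XSD

EX = Namespace("http://example.org/zoo#")  # hypothetical namespace for the example
g = Graph()  # building a model of some subject matter -> building an ontology

# kinds of things -> classes; generalization/specialization -> subClassOf
g.add((EX.Animal, RDF.type, OWL.Class))
g.add((EX.Lion, RDF.type, OWL.Class))
g.add((EX.Lion, RDFS.subClassOf, EX.Animal))

# things -> individuals; some kind of thing -> instance or member of a class
g.add((EX.leo, RDF.type, EX.Lion))

# relationships between things -> object properties
g.add((EX.eats, RDF.type, OWL.ObjectProperty))
g.add((EX.leo, EX.eats, EX.gazelle))

# attributes of things -> data properties; kinds of literals -> datatypes
g.add((EX.age, RDF.type, OWL.DatatypeProperty))
g.add((EX.leo, EX.age, Literal(7, datatype=XSD.integer)))  # literal is of a certain kind

# saying something -> asserting a triple (each g.add call above asserts one triple)
# drawing conclusions -> inference: an OWL/RDFS reasoner would infer that leo is
# also an Animal, because Lion is asserted to be a subclass of Animal.
print(g.serialize(format="turtle"))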