1 July 2021

Customer Satisfaction Metrics

CX = Customer Experience

NPS = Net Promoter Score (use for long-term customer loyalty)

CSAT = Customer Satisfaction Score (use for short-term customer satisfaction)

CES = Customer Effort Score


NPS is useful for:

  • Growth Targeting
  • Brand Loyalty
  • Happiness
  • Feedback
  • Honesty
  • Overall Experience
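
To make the scoring concrete, here is a minimal sketch of how NPS and CSAT are commonly calculated; the 0-10 recommendation scale and 1-5 satisfaction scale are the usual conventions, and the sample responses are made up for illustration.

```python
# Minimal sketch of NPS and CSAT calculations; sample responses are made up.

def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6) on a 0-10 scale."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)

def csat(scores):
    """Customer Satisfaction Score: % of satisfied responses (4 or 5) on a 1-5 scale."""
    satisfied = sum(1 for s in scores if s >= 4)
    return 100.0 * satisfied / len(scores)

recommend = [10, 9, 8, 7, 3, 10, 6, 9]   # hypothetical "how likely are you to recommend us?" answers
satisfaction = [5, 4, 4, 2, 5, 3]        # hypothetical "how satisfied were you?" answers

print(f"NPS:  {nps(recommend):.1f}")     # ranges from -100 to +100
print(f"CSAT: {csat(satisfaction):.1f}%")
```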

28 June 2021

What Does An AI Researcher Do?

There is nothing that an AI Researcher does that an AI Engineer cannot do. In fact, the following make up the majority of a researcher's job function.

They research new ways of doing things; the magic word here is explore. They don't build systems that can scale to hundreds of thousands of data points; for that they need an AI Engineer.

They read research papers and apply them in context, which an AI Engineer also does.

They stay current on AI trends and apply them in context, which an AI Engineer also does.

They hold a Phd, which an AI Engineer may or may not hold, nor necessarily need, to do their work.

They spend time building novel algorithms, which is a bit questionable as they have poor programming skills and lack the ability to scale any of those algorithms. On top of that, solving a problem requires practical experience, which one does not get from reading research papers. Again, an AI Engineer can pretty much do this, and likely better, and has the mindset to apply design patterns by doing things the correct way from prototypes to scalable solutions.

They get sponsorship funding for research work, which they get from holding a Phd but not necessarily from having any skill at converting theory into practice; for that they again need an AI Engineer. They also lack the ability to manage data, and they find it difficult to adapt to a change in research approach. They tend to stick to what they know rather than learn what they don't know as part of adapting to change. They rarely approach things outside the boundaries of their so-called specialist area. Funnily enough, in most cases it is a research assistant at an academic institution who builds the practical application of the theory, not the professor.

They are specialized in a certain field, have likely taught in academia, given conference talks, and published papers in the area. One cannot be a specialist in an area without any practical skill at applying it. It is questionable whether much of what they publish is even worthy of research, which is precisely why 80% of research coming out of industry and academic institutions amounts to nothing. The 20% that does get classed as a research breakthrough tends to be built by someone sponsored by an organization, looking to solve a specific practical issue, who doesn't hold a Phd and likely has a practical engineering background. Furthermore, it is surprising how many AI Researchers make poor teachers. Invariably, many also dislike teaching but have to do it as part of being linked to an academic institution.

They collaborate on standardization efforts and open source artefacts as part of converting theory into practice. Even here, the majority of open standards are produced by people with practical experience who understand the gaps in an application area. An AI Researcher rarely produces anything of significant value but usually tries to take credit for a large portion of the effort, which is likely done by an AI Engineer.

One can notice that practical application is far more important than theory. It is from practical experience that one can understand problems and work towards a solution. There is also no single way of solving a problem. A basic theoretical background can be acquired without going to university or even attaining a Phd. There are books and online courses for literally anything and everything one can think of.

They have greater opportunities for academic influence and research. This may be true because they have built up a network within the area for collaboration. However, most academic institutions get their funding from government trusts, grants, or the private sector. Unless an AI Engineer has a sponsorship or a network of associates that can provide funding for their work, they may be at a slight disadvantage. AI Researchers, as a result of networking, also tend to have a greater reach of influence, but even this can be matched by an AI Engineer who develops influence through practical applications. One way of beating an AI Researcher at their own game is to build open source projects, publish papers of one's own, and build a portfolio of practical solutions delivered to organizations. One doesn't need an advanced degree for practical achievement. Invariably, it is far more important to have the tenacity, curiosity, and enthusiasm to learn, explore, and extend towards building a practical, novel solution.

AI Researchers tend to have a more focused academic background, with limited practical experience, as evidenced on their resume by publications and conference talks, whereas an AI Engineer tends to have a stronger practical bias and may or may not have published papers. However, at times the job functions may be interchangeable, given the confusion and disarray in communication at many clueless organizations. A Phd background is not an automatic pass to being an expert in the area. One may at times come across people calling themselves AI Researchers with a Phd who are in fact completely clueless about the field or its application. One has to be mindful when hiring such people. There are many ways of spotting the fake AI Researcher, many of which relate to their lack of objectivity, their questionable attitude, their questionable understanding of a topic, their confused sense of ethics, and their lack of a critical evaluation process:

  • they confuse Neuro-Linguistic Programming (NLP) with Natural Language Processing (NLP)
  • they lack professionalism and respect when interacting with non-Phd people
  • they have a history of practicing academic dishonesty, which in workplace interactions converts into unethical practices and a false sense of entitlement
  • they are often hypocritical with their notion of ethical principles and code of conduct
  • they don't treat others with respect and have an ingrained tendency to be overly defensive and biased in their communication
  • they have published papers that show very little critical evaluation
  • they have published papers that are likely plagiarised
  • they have published papers that are not theoretically correct
  • someone else may have written a published paper for them, for which they took credit, e.g. via crowdsourcing or with most of the work done by a supervisor to meet passing research indicators
  • they are generally clueless, contradict themselves with their own actions and explanations, and dig themselves into an even bigger hole of illogical thinking
  • they haven't really published much after their thesis work
  • they have a mediocre citation score for the majority of their papers
  • they lie about their background
  • they violate basic privacy laws during meetings, are rude in their interaction, or appear insecure by trying to invalidate others
  • they are not self-critical of their own work and their own deficiencies; they spend more time criticizing others than on self-reflection
  • they think they know what they are talking about just because they hold a Phd
  • they have peculiar mannerisms, and the way they come across on a topic makes you question their qualifying background
  • they display an unwelcoming or condescending attitude
  • they like to use a lot of flowery language and impress upon others how busy they are, even if they aren't busy at all
  • they will try to impress with their background, but will get caught using incorrect terms and incorrect logical reasoning, automatically invalidating their position
  • they may use a lot of assumptions in their speech without backing up their claims
  • they are unable to translate anything into any form of practical output, and what little they produce is packaged up as an API wrapper around someone else's work
  • they are enamored with the academic institution they attended for their Phd and keep using it as their defense, but have little to no contextual understanding of concepts at any level of technical depth when applied in practice
  • it is very easy to put them on the spot and leave them speechless at being caught out
  • they have a tendency of using cognitive biases in their actions and speech
  • if the workplace CCTV of their interactions were played back to them, it would not only be embarrassing but would display both their rudeness and unprofessionalism

Open Standardization In Artificial Intelligence

Open Standards in Artificial Intelligence

Knowledge Graph Embedding Libraries

  • Ampligraph
  • OpenKE
  • Scikit-KGE
  • OpenNRE
  • LibKGE
  • PyKG2Vec
  • GraphVite
  • PyKeen
  • DGL-KE
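
As a quick illustration of what these libraries provide, below is a minimal sketch using PyKeen's pipeline API to train a TransE embedding model on one of its bundled toy datasets; the dataset, model, epoch count, and seed are arbitrary choices for the example, not a recommendation.

```python
# Minimal PyKeen sketch (pip install pykeen): train a TransE knowledge graph
# embedding model on the small bundled 'Nations' dataset. All choices here
# (dataset, model, epochs, seed) are arbitrary example values.
from pykeen.pipeline import pipeline

result = pipeline(
    dataset="Nations",                      # toy dataset shipped with PyKeen
    model="TransE",                         # translational embedding model
    training_kwargs=dict(num_epochs=20),
    random_seed=42,
)

print(result.metric_results.get_metric("hits@10"))   # link-prediction quality
result.save_to_directory("nations_transe")           # persist the trained model and metrics
```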

27 June 2021

Incompetent Data Scientists

What does a Machine Learning Engineer do?

Everything a Data Scientist is supposed to be able to do

What does a Data Engineer do?

Everything a Data Scientist is supposed to be able to do

What does a Knowledge Engineer do?

Everything a Data Scientist is supposed to be able to do

What does an NLP Engineer do?

Everything a Data Scientist is supposed to be able to do

What does an AI Engineer do?

Everything a Data Scientist, a Machine Learning Engineer, an NLP Engineer, a Knowledge Engineer, and a Software Engineer is supposed to be able to do

So, why isn't a Data Scientist doing any of this, and why are there so many additional roles? Why not just hire an AI Engineer and skip the mumbo jumbo of roles?

Because the majority of Data Scientists in industry only like to build models, and the models they build are not suitable for prime-time production, nor are they evaluated correctly, nor are they built correctly. Basically, they are incompetent; many hold Phd degrees and were never taught basic Computer Science concepts. In industry, roles are created to fill gaps in particular skills. Shockingly, all the above roles can be done by anyone with a Computer Science degree. Because Machine Learning is part of AI, Knowledge Representation and Reasoning is part of AI, NLP is part of AI, Data Engineering is part of Computer Science, Software Engineering is part of Computer Science, Data Science is part of Computer Science, and AI is part of Computer Science. And all these concepts are taught in a Computer Science degree. The industry has become a hodgepodge of roles because of hiring incompetent Phd people who need help with literally everything to do their job. And the job they do amounts to nothing because everyone else is pretty much doing the job for them. Even the aspect of research requires some artefact to be produced, which they need help with to complete their work. In fact, they even need others to help peer review their papers. The useless Phd researcher is more of a redundant role in industry, and as the accounting paperwork would show, hiring one of them leads to hiring a whole list of other people to help them do their work. Badly designed, academically inclined recruitment processes and badly designed academic courses: it is a right mess that Phd individuals have created in industry and at academic institutions, with a total lack of ability to convert theory into practice. Such things are only going to get worse, as more organizations look to hire clueless Phd people under the false pretence that they actually have practical experience and expertise in their respective domains. In fairness, they are likely to be more hypocritical, arrogant, and egotistic, and come with a huge number of shortcomings both in their approach to work and in the practical application of it. Fundamentally, a Phd individual lacks the mindset to think in terms of abstractions, constraints, and distributed systems, does not take into account real-world complexity, and does not account for the aspects of uncertainty that come with modeling data and the relevant scope for error. There are just too many clueless people in industry who will simply follow the crowd and never really question whether any of it makes sense. In fact, management and investors, in particular, tend to dictate pretentious attitudes to recruitment and to designated work roles. So long as organizations keep desiring Phd individuals, they will have to keep recruiting more engineers to support them with everything: what an utter waste of budgets, displaced talent, and resources. The areas of Machine Learning, NLP, and Knowledge Graph techniques have been around for decades compared to the more recent role of the Data Scientist. In fact, traditionally, the role of Data Scientist did not even incorporate aspects of AI; this has been a fairly recent addition to the role function. Traditionally, Data Science used to be mostly about the application of a limited subset of Machine Learning approaches in the context of Data Mining, which now overshadows the domain of Data Analysis.
Even here, many people in industry make the assumption that AI is all about Machine Learning, or that the Data Science profession is all about Machine Learning application, which could not be further from the truth. Organizations need to re-evaluate their hiring requirements and hire for such roles with an engineering mindset, where the person is involved in the entire end-to-end data science method, rather than a Data Scientist whose only interest is in building Machine Learning models with a subjective evaluation that are inherently overfitted to the data, with little to no appreciation of the entire work effort.

26 June 2021

Fake Devops Engineers

  • It takes them 1 to 2 months to spin up an instance on the cloud when it should take a couple of minutes at most (the whole process literally takes a few seconds on most cloud environments), apart from additional time for setting up security groups, which should take 2 days or possibly a week.
  • Negating everything you say, then using your suggestions as their own
  • Taking longer than is normal to provision and setup an environment
  • Having excuses for everything when things go wrong
  • Playing blame games
  • Not provisioning sufficient monitoring and automation services
  • Have they ever attended a devops conference?
  • They prefer windows to linux environments
  • They get frustrated very quickly at the most silly things
  • They confuse ops with devops
  • They find it difficult to understand that any regular polygon can fit into a square (this is a typical test of being able to understand abstractions, which even works for identifying a fake architect)
  • Don't understand infrastructure as code
  • Don't understand the relationship between development and operations
  • Don't understand how to manage and use automation
  • Don't understand what small deployments means
  • Don't understand what feature switches mean
  • Don't understand how to use, nor have heard of, Kanban, Lean, and other Agile methods
  • Don't understand how to manage builds and operate in a high-velocity environment
  • Don't understand how to make sense of automation, tools, and processes
  • They don't understand the devops workflow
  • They lack empathy
  • They don't understand trunk-based development
  • They don't understand what a container is used for
  • They don't know how to manage an orchestration process
  • They don't know how to manage a staging environment
  • They don't know what serverless means
  • They don't understand the difference between microservices and a monolith
  • They don't understand immutable infrastructure
  • They don't know what type of devops specialist they are
  • They don't know how to create a one-step build/deploy process
  • They don't know how to instil trust and respect
  • Not having any favorite devops tools
  • Not having any specific devops strategies or best practices
  • They cannot explain how they decide whether something needs to be automated
  • They find it difficult to solve issues when things go wrong
  • They find it difficult to embrace failure and learn from their mistakes
  • They have difficulty in problem-solving in production environments
  • They find it difficult to link up tools and technical skills with culture and teamwork
  • They have a big ego rather than a humble nature when it comes to self-characterization
  • They over-compensate with self-promotion but do not acknowledge their deficiencies

Kubernetes Ecosystem

Backup

  • Velero

CI/CD

  • Argo
  • Flagger
  • Flux
  • Keda
  • Skaffold
  • Spinnaker

CLI

  • helm
  • k9s
  • ktunnel
  • Kubealias
  • Kubebox
  • Kubectx
  • Kubens
  • Kubeprompt
  • Kubeshell
  • Kubetail
  • Kubetree
  • Stern

Clustering

  • Eksctl
  • k3s
  • kind
  • kops
  • Kube-AWS
  • Kubeadm
  • Kubespray
  • Minikube
  • Gravity
  • Kaniko
  • Ingress
  • KubeDB

Data Processing

  • Kubeflow

Development

  • Garden
  • Makisu
  • Telepresence
  • Tilt
  • Tye
  • Teresa

Mesh

  • Istio
  • Linkerd
  • Nginx Mesh

Monitoring

  • Dashboard
  • Grafana
  • Kiali
  • Prometheus
  • Kube-state-metrics
  • Kubecost

Networking

  • Coredns
  • Externaldns
  • Kubedns

Security

  • Falco
  • Gatekeeper
  • SealedSecrets

Storage

  • Rook

Testing

  • Popeye
  • k6s
  • Kube-Monkey

Native

  • KNative
  • Tekton
  • Kubeless
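
Most of the tools above sit on top of the same Kubernetes API. As a small illustration of interacting with that API programmatically, here is a hedged sketch using the official Kubernetes Python client; it assumes a reachable cluster, a valid local kubeconfig, and the default namespace.

```python
# Minimal sketch using the official Kubernetes Python client (pip install kubernetes).
# Assumes a reachable cluster and a valid local kubeconfig; namespace is an example.
from kubernetes import client, config

config.load_kube_config()          # reads ~/.kube/config (use load_incluster_config() inside a pod)

core = client.CoreV1Api()
apps = client.AppsV1Api()

for pod in core.list_namespaced_pod(namespace="default").items:
    print("pod:", pod.metadata.name, pod.status.phase)

for dep in apps.list_namespaced_deployment(namespace="default").items:
    print("deployment:", dep.metadata.name, dep.status.ready_replicas)
```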

25 June 2021

Bad Processes for Applications And Interviews

Interview and application processes at many organizations generally require a massive overhaul across the board, but for IT-related jobs in particular. The points below highlight the glaring truths about such recruitment practices:

  • Providing puzzles to solve that no one in their right mind will ever need to do on the job
  • Providing Codility or HackerRank tests that anyone can cheat their way through
  • Asking bookish questions to test memorization skills or to impress upon the candidate
  • Asking questions that are totally irrelevant to the role function
  • Asking one to do a test, like why? Do you ask a builder to build you a sample wall before you hire them to build you a wall?
  • Coming into an interview with certain assumptions about the candidate even before interviewing them
  • Asking them badly worded questions like "what gives you energy?"
  • Asking someone to do pair programming. Do people naturally talk out loud in any job function? Has anyone ever heard of pairing in other job functions like finance, admin, marketing, or operations?
  • Having an excessive number of interview stages
  • Having a bad attitude or being unprofessional while interviewing a candidate
  • Hypocritical behavior, like making a candidate wait for a long time for an interview, but having an issue when the candidate is running late
  • Having silly tests that have no basis
  • Applying for one role but trying to interview the candidate for another role without candidate consent
  • Messing around with candidates and wasting their time during application and interview stages
  • Not providing application and interview feedback. Providing feedback is a requirement if one wants to be in compliance with GDPR as part of the processing and storage of candidate information.
  • Changing the job role part way through an interview process
  • Interviewing candidates before securing funding for the work
  • Not being honest about the job role
  • Overselling and underdelivering on the job function
  • Using terms like cultural fit to justify their biases
  • Expecting certain educational backgrounds that are unnecessary for the job function
  • Showing interest in a candidate purely on basis of where they got their degree or which company they had previously worked for
  • Not focusing on candidates' practical skills
  • Giving candidates silly tests to do throughout the interview stage
  • Making candidates feel uncomfortable during interview stage
  • Not providing water or asking for refreshments before a face-to-face interview
  • Not listening to what the candidate has to say or not letting them speak
  • Evaluating candidates purely on basis of likability and subconscious and unconscious biases
  • Rejecting candidates for roles just because they are women or part of a minority group
  • Rejecting candidates for roles on basis of religion or other such prejudices
  • Not being considerate and respectful with candidates
  • Being overly distrustful and pessimistic of candidates through the entire process
  • Not answering basic questions of candidates that would help them evaluate the job function
  • Being difficult, unapproachable, and not being forthcoming with candidates
  • Taking too long to provide feedback or not providing any at all
  • Not realizing that job interviews are a two way process
  • Rejecting candidates for using American English rather than British English for spelling words
  • Rejecting candidates for grammatical mistakes and being too pedantic
  • Rejecting candidates based on their looks and appearances
  • Rejecting candidates based on disability and not being sufficiently accommodating
  • Changing interview times or cancelations at the last minute
  • Purging the entire job application database so the candidate who might have spent time on the application has no chance to be reviewed, and likely has to apply again
  • Advertising for jobs that do not exist
  • Advertising for jobs but having a preferred source of candidates
  • Advertising for a job where the job title does not match the job description
  • Advertising for a job when the job has already been filled
  • Using job ads as a marketing gimmick
  • Asking for age, date of birth, and race on the job application
  • Not focusing the application and interview to what the job actually requires and entails
  • Not interviewing candidates on their relevant merits
  • Using silly benchmarks and psychometric tests
  • Not reviewing every job application and candidate
  • Using non-technical people, who have no background in the skills required for the job function, to pre-screen candidates
  • Using job titles rather than the context and content of work when evaluating job applications
  • Screening candidates by keywords and not context and content of work
  • Making it difficult for candidates to approach organizational recruitment teams for enquiries or feedback
  • Not acknowledging job applications nor the deletion of job applications
  • Refusing to shake the candidate's hand after an interview
  • Using cognitive and prejudicial biases to screen a candidate
  • Having badly designed job application forms
  • Having poor communication skills as interviewer but expecting amazing communication skills from interviewee
  • Going off on a tangent and losing focus
  • Rejecting candidates because they didn't feel comfortable with, or didn't take a liking to, your pet dog or cat in the office
  • Being rude and offensive to candidates
  • Talking about diversity awareness, but not having much of a diverse workforce in the office, nor displaying an open-mind about diversity of cultures
  • Using excessive stereotypes and generalizations in communication with candidates
  • Not being careful with using gender pronouns
  • Using innuendos whether sexual or otherwise to invade privacy or personal space of candidates
  • Don't ask silly questions like "how are you" during a lockdown period or a pandemic, as you can expect a diplomatic answer at best; in many cases the candidate could find the question quite inappropriate, given the circumstances of the situation
  • Don't expect a candidate to have video on during a virtual interview session; in fact, you should not even care what the person looks like in the first instance

24 June 2021

People That Memorize Things

There appear to be people in the IT industry who memorize literally everything, from APIs, libraries, and functions to the whole lot. In some job interviews they will even ask questions that purely test one's memorization skills. Such efforts at memorization are futile. In many cases, it is unnecessary and a pointless exercise in wasted time. One should never have to memorize something that they can look up or autocomplete, especially if it is being used as a tool to complete work. Technology moves at such a fast pace that new versions are released, APIs are changed, new ways of doing things are introduced, and in time previous methods may be deprecated. In many cases, people who memorize such tools are likely doing it to pass a certification exam. Perhaps something else to question is the need for such a pointlessly designed certification. Invariably, memorization is tested by people who are academically inclined, who use it as a yardstick for others, and who have very poor practical skills at applying any of it themselves. Phd people tend to fall into such an academically inclined group with poor practical skills. All in all, memorization in most contexts, from academia to practical life, adds little value.

20 June 2021

Why Pure Theoretical Degrees Are Useless

Theoretical degrees are utterly useless in the practical world. However, they may be useful for teaching. The reason being that they have zero element of practical reasoning. If one can't apply theory in practice, then what is the point of such degrees? Math degrees teach concepts; they provide the formula and they provide the problem. In physics, they provide the problem and they provide the formula to solve that problem. Such degrees amount to little in application. When such people enter the practical world, they need to be taught how to do literally everything. One wonders where, along the way of attaining such a theoretical degree, they forgot how to think in order to apply themselves. In the real world, one has to find the problem and then find a way to solve that problem, and this is the case ninety-nine percent of the time. The only way one can combine such excessive theory is to add an element of engineering to it. In biology, chemistry, and other such courses, the degree is transformed into application for medicine, pharmacology, and life science disciplines. Any degree that only provides an element of theory is pointless, as it is only good for academic purposes. In the practical world, one has to be taught how to apply such theory into practice to be productive and useful in society. Increasingly, universities are failing to combine theory with practice, because they aim to meet numbers for educational measures and indicators so as to achieve more academic funding. The biggest mistake employers can make in the IT world is to recruit graduates with purely theoretical backgrounds to do practical AI work. Don't hire a math, physics, or statistics graduate to build a machine learning model; they will require a lot of mentoring and training. The majority of practical and theoretical AI work requires a computer science background, where such material is formally taught in the degree course.

Data Journalism

Data Journalism

18 June 2021

Why Pure Probabilistic Solutions Are Bad

In Data Science, there is a tendency to focus on machine learning models that are inherently based on statistical outcomes and are essentially probabilistic in nature. However, to test these models one uses an evaluation method that is also steeped in statistics. Then, to apply further analysis on explainability and interpretability, one again uses a statistical method. What this turns into is a vicious cycle of using statistics to explain statistics, with an uncertainty of outcomes. At some point, one will need to incorporate certainty to gain confidence in the models being derived for a business case. Essentially, knowledge graphs serve multiple purposes here: they increase certainty in the models and provide logical semantics that can be derived through constructive machine-driven inference. Logic, through relative inference, can give a definite answer, while the machine learning model can at most provide a confidence score for whether something holds or doesn't hold, with a total lack of care for contextual semantics. A machine learning model rarely provides a guaranteed solution, as it is based on targets of approximations and error. Hence the tendency to measure bias and variance in training, testing, and validation data. The evaluation is also relatively based on approximations of false positives, false negatives, true positives, and true negatives. Logical methods can be formally tested. A machine learning model can at most be subjectively evaluated with a degree of bias. Invariably, at any iterative time slice, a purely statistically derived model will always be overfitted to the data to some degree. Statistics derives rigid models that don't lend themselves to providing definite guarantees in a highly uncertain world. Invariably, the use of statistics is to simplify the problem into mathematical terms that a human can understand, solve, and constructively communicate. Hence the huge statistical bias in academia, which tends to be a traditionally very conservative domain of processing thoughts and reasoning over concepts as a critical evaluation method within the research community. One could say that such a suboptimal solution may be good enough. But is it really good enough? One can always provide garbage data and train the model to provide garbage output. In fact, all the while, the statistical model never really understands the semantics of the data well enough to correct itself. Even the aspect of transfer learning in a purely statistical model is derived in a probabilistic manner. The most a statistically derived model can do is pick up on patterns. But the semantic interpretability of such data patterns is still yet to be determined with any guarantee of certainty; in fact, it is presumably lost in translation. Even the notion of a state-of-the-art model is fairly subjective. Evaluations that only look at best-cost analysis in terms of higher accuracy are flawed. If someone says their model is 90% accurate, one should ask: in terms of what? And what happens to the other 10% that they failed to account for in their calculations, which is an error that the person pipelining the work will have to take into account? Invariably, such a model will likely then have to be re-evaluated in terms of average-cost and worst-cost, which is likely to mean an increase in variable error of between 5% and 15%. The average-cost error is likely to lie somewhere around 10% and the worst-cost somewhere near 15%.
So, 90% of the time in production, the idealized performance accuracy of the model would be 90% - 10% = 80% on an average-cost basis, plus or minus 5% on a best-cost basis, and anywhere between minus 10% and minus 15% on a worst-cost basis. This implies that 5% of the time the model will perform at best-cost, 5% of the time at worst-cost, and 90% of the time at average-cost, where the idealized predictive accuracy, when the full extent of the error is taken into account, would be 80%. Even though this is still fairly subjective, an idealized metric that takes environmental factors into account at least improves on the certainty. This is because in most cases a model is built under the assumption of perfect conditions, without taking into account the complexity and uncertainty that would be present in a production environment. There is also a need to be mindful, sensible, and rational about the accuracy paradox. One can conclude here that a hybrid solution combining probabilistic and logical approaches would be the best alternative for reaching a model generalization of sufficient certainty to tackle the adaptability mechanism for process control, as well as to capture the complexity and uncertainty of the world.
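
The arithmetic above can be made explicit with a small sketch that folds the best-, average-, and worst-cost errors into a single expected production accuracy; the percentages are the illustrative figures from the paragraph above, not measurements.

```python
# Sketch of the best/average/worst-cost adjustment described above.
# All figures are the illustrative numbers from the text, not real measurements.
reported_accuracy = 0.90          # the claimed "90% accurate" evaluation result

scenarios = {
    # name: (share of production time, error subtracted from the reported accuracy)
    "best-cost":    (0.05, 0.05),
    "average-cost": (0.90, 0.10),
    "worst-cost":   (0.05, 0.15),
}

expected = sum(share * (reported_accuracy - error) for share, error in scenarios.values())

for name, (share, error) in scenarios.items():
    print(f"{name:12s}: {share:.0%} of the time at {reported_accuracy - error:.0%} accuracy")
print(f"expected production accuracy: {expected:.1%}")   # dominated by the 80% average-cost case
```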

15 June 2021

Bias, Slant, and Spin

Hydroponics Fruits, Herbs and Veggies

The agrotech industry is a fairly active area of AI research. The setup requires an indoor growing medium, lighting, and AI to make it happen. The list below includes the most common fruits, herbs, and vegetables grown in hydroponics:

  • Cucumber
  • Tomato
  • Lettuce
  • Strawberry
  • Basil
  • Coriander
  • Spring Onion
  • Pepper
  • Spinach
  • Blueberry
  • Radish
  • Kale
  • Green Bean
  • Chive
  • Mint
  • Cabbage
  • Carrot
  • Cauliflower
  • Celery
  • Beet
  • Chard
  • Broccoli
  • Corn
  • Eggplant
  • Leek
  • Sage
  • Melon
  • Onion
  • Pea
  • Zucchini
  • Squash
  • Parsley
  • Oregano
  • Rosemary
  • Thyme
  • Chamomile
  • Dill
  • Lavender
  • Chicory
  • Fennel
  • Mustard Cress
  • Lemon Balm
  • Watercress
  • Stevia
  • Peppermint
  • Tarragon
  • Bok Choy

11 June 2021

Thorny Weeds of Data Science

In industry, the area of data science is a bit like navigating through a large field of thorny weeds. There are just too many people pitching themselves as experts who don't understand what they are doing. Many of them have Phd backgrounds and a complete inability to translate theory into practice. The field is a breeding ground for insecure people in teams with purely academic backgrounds. For everything, they require help and additional resources, adding to the frustration of their peers and of the management who have to support their work with unnecessary inefficiencies and extensive funding. The patterns of recruitment, or lack thereof, seem to be typical across many organizations. Often these false hopes of hiring Phd individuals to lead research work in business stem from clueless investors who have neither the interest nor the sense to understand the practical aspects of the field. Interview rounds for candidates become a foregone conclusion of misplaced, inept, and aberrant hiring. As a result, the entire organization becomes guilty of compounding issues by hiring incapable, pretentious, and arrogant individuals, lacking basic common sense, who may apparently have stellar academic credentials. In many cases, the merit of a Phd qualification is questionable, where much of it may have been gained via crowdsourcing platforms or even via the extensive help of the academic advisor with the thesis write-up. Most things that a Phd-caliber individual can do, a non-Phd individual can do better and translate into a product. If a Phd individual cannot convert theory into practice, then what is the point of such a hire? Considering that only one percent of the population holds a Phd, is it any wonder that organizations are so ill-informed as to call it a skills shortage when there isn't one to begin with? They should really focus on correcting their idealized job requirements. Invariably, organizations learn the hard way when projects fail to deliver and there is no tangible return on investment from a Phd hire. Is this evidence of a failure of the education system, of the entire technology industry, or perhaps both? Flawed online training courses, mentoring, and certification courses further amplify this ineffective practice. Bad code and bad models just breed bad products for the end-user, which ultimately affects investment returns through lack of business performance, where targets need to be offset through additional end-user support plans. It still stands to reason that, for any company, people are the biggest asset. Hiring the right people and looking beyond the fold of their credentials is paramount. In fact, hiring astute generalists is far more important than hiring specialists in the long term. A specialist in a certain area, at a given point in time, is likely to be an outdated specialist in the short-term to long-term cycle of work. Organizations that function as businesses need to strategize their game plan and forecast for the future, which may just be a few quarterly cycles ahead of time. The quickest way to a failed startup is to increase costs by hiring Phds and increasing the headcount of staff to support their work. One needs to wonder why so many people should be hired to support someone with a Phd unless that person is practically incompetent. Academics invariably cannot translate theory into practice, which impacts delivery cycles of work.
Phds, as a result, become the weakest link in many cases, inhibiting and hampering the cycle of productivity and innovation for both the short-term and the long-term growth of an organization.

7 June 2021

Why Microsoft Products Are Terrible

  • Tight coupling approach to products and services
  • Documentation bias towards own products and services
  • This only works on windows
  • Plagiarism and stolen code from other vendors
  • Security risks and software glitches
  • Business model built on stolen products, services, ideas, and code
  • Market hijacking
  • Consolidation rather than any significant innovation
  • Windows copied from Mac
  • DOS copied from CP/M (Digital Research)
  • Bing copied from Google
  • Explorer copied from Netscape
  • All windows versions come with design flaws
  • Lack of separation between OS and application
  • Unrepairable codebase
  • Waste of resources
  • The dreadful blue screen of death
  • Unreliable as cloud servers
  • Trying to do everything but master of none
  • Terrible network management
  • Terrible at internet related services and products
  • Enjoys copying other competitors
  • Lots of security vulnerabilities
  • Forced sales targets for substandard products and services
  • Marketing gimmicks that breed lies and failed promises
  • Buying open source solutions to kill off the competition
  • Doesn't support open source community
  • Works on the vulnerabilities of ignorant customers
  • Ease of use can be subjective and come at the detriment of quality
  • Ignorant users are happy users
  • Forcing updates and patch releases for security failures in quality
  • Bad practices and foul play
  • Forcing users to use windows instead of linux or mac
  • Vendor lock-in and further use of the cloud to apply the same methodologies
  • Business as usual with anti-trust
  • Rigged tests and distorted reality
  • Bogus accusations
  • Censorship
  • Limited memory protection and memory management
  • Insufficient process management
  • No code sharing
  • No real separation between user-level and kernel-level
  • No real separation between kernel-level code types
  • No maintenance mode
  • No version control on DLL
  • A weak security model
  • Very basic multi-user support
  • Lacks separation of management between OS, user, and application data  
  • Does not properly follow protocol standards
  • Code follows bad programming practices
  • Anti-competitive practices to block open source innovation and technological progress

6 June 2021

Why Build Your Own Cloud Infrastructure

It can benefit organizations to move away from third-party cloud providers and develop their own in-house cloud strategy. The following highlights some reasons:

  • Too dependent on third-party infrastructure from cloud provider
  • Too much trust in third-party cloud provider for your organizational needs
  • Compliance and privacy breaches from cloud provider
  • Leaked secrets to competitors from cloud provider
  • You don't own your own data from cloud provider
  • You don't know where your data is held from cloud provider
  • Geo-located third-party services make it difficult to keep track of governance
  • Tight coupling to the cloud provider
  • Have to build design architecture dependent on cloud provider services and orchestration process
  • Have to build design architecture according to cloud provider access/role services and policies
  • Cloud provider can block your services at anytime
  • Cloud provider could be using other third-parties
  • Cloud provider may lack customer care when you require support
  • Your service uptime is dependent on cloud provider uptime
  • If your choice of cloud provider shuts down permanently, it will require a massive migration
  • A cloud provider decommissioning a service leads to sudden re-engineering and rethinking of services
  • Logging anything from cloud provider can be limited and at times problematic in transparency
  • Cloud provider cost is variable and can change at anytime
  • Loss of data from cloud provider
  • Your customers will be affected by the downtime of the cloud provider and any lack of support
  • Control your own destiny, security, orchestration, and architecture
  • Control your own backups
  • Control your own data governance and user management
  • Control your own cost of maintenance
  • Control your own reliability and scale out needs
  • Control your own data and storage
  • Control your own organizational assets
  • Control your own organizational liabilities
  • Recruit and screen your own employees that manage your cloud infrastructure (know your employees)
  • Flexibility to sell your own cloud to other third-parties
  • Build services that measure up to organizational requirements
  • Know exactly where your data is stored and meet regulatory requirements for compliance and audit
  • Build your own AI and data science infrastructure
  • Make your own cloud strategy fully automated
  • Make it more responsive to failure and fault tolerance
  • Build your own secret knowledge graph sauce for your organization using own infrastructure
  • No longer dependent on specialist resources for your organizational needs
  • Technology is more advanced than it used to be, things are getting simpler to manage
  • Don't use Azure, it sucks, an organization is definitely in a better position to develop their own cloud strategy
  • Don't use GCP, it sucks, an organization is definitely in a better position to develop their own cloud strategy
  • Don't use AWS, it sucks, an organization is definitely in a better position to develop their own cloud strategy
  • Don't use some other opinionated cloud provider, an organization is definitely in a better position to develop their own cloud strategy

5 June 2021

Unpredictable Google

Google services are the worst. One minute they are available for use; the next minute they are going through a decommissioning process. Then there is the aspect of their page ranking algorithms, which keep changing and affecting publisher revenue. Not to mention the way they have recently been giving preferential treatment through a preferred advertising supplier network. One minute an API is available to use, the next minute it is gone. The same is the case on GCP. Nothing seems to stay around for very long before it is changed, with a total lack of regard for the user and no time frames given for planning a migration. Not to mention the fact that to find any information one has to literally hunt for it. One would think that, as a search company, they would know how to make their search and findability functions user-friendly, but no. And it takes ages to remove anything from their search engine. The company is also slack in following basic privacy and regulatory compliance. The company just gives off an air of arrogance, as if they can get away with everything without really being responsible with user data. There seems to be a complete disconnect across the internal organization, which shows in their product and service initiatives. Over the years, with multiple court cases in the international community, Google has slowly but surely been losing the credibility of its services with users. A large company like Google eventually meets its fate when more issues with the reliability and security of its services come into question, while frustration increases for users over its lack of responsive customer care and dodgy business practices. A perfect example of a company that just doesn't care about the end-user.

28 May 2021

TypeDB

TypeDB

Drawbacks of SHACL

SHACL is a constraint language for validating RDF graphs against a set of conditions. It works under a closed-world assumption when the end goal is an open-world assumption. The validation is only really defined against those set conditions of constraint. There will be cases where validations are missed at the point of inference. The entire search space cannot be tested against constraints defined over a set of conditions. The approach can be seen as rather the opposite of the intended goal. What follows from a SHACL validation can lead to a form of reverse engineering that results in partially closed-world assumption criteria. It may be better to introduce SHACL earlier in the process rather than later, so as to avoid conflicting outcomes. SHACL validation tests can quickly get out of hand if acceptance tests against requirements defined in question/answer form become a form of integration validation tests where constraints have overlapping dependencies. One can notice how quickly SHACL tests form unmaintainable targets and impervious constraints derived from a set of conditions. In all fairness, one may want to validate the graph at the point when it is in a closed-world state and not after it has been generalized to an open-world assumption.
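
For readers unfamiliar with what such validation looks like in practice, here is a minimal sketch using pySHACL with rdflib; the ex: namespace, the shape, and the data are made up purely to show that the validation only covers the conditions one has chosen to write down.

```python
# Minimal pySHACL sketch (pip install pyshacl rdflib). The ex: namespace, shape,
# and data are made-up examples; real shapes would be domain-specific.
from rdflib import Graph
from pyshacl import validate

data = Graph().parse(data="""
@prefix ex: <http://example.org/> .
ex:alice a ex:Person ; ex:name "Alice" .
ex:bob   a ex:Person .                    # missing ex:name
""", format="turtle")

shapes = Graph().parse(data="""
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/> .
ex:PersonShape a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [ sh:path ex:name ; sh:minCount 1 ] .   # only this condition is ever checked
""", format="turtle")

conforms, report_graph, report_text = validate(data, shacl_graph=shapes, inference="rdfs")
print(conforms)        # False: ex:bob violates the one constraint that happened to be defined
print(report_text)
```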

23 May 2021

Who is regarded as an expert?

Is an expert someone with years of practical experience, or someone with a Phd who doesn't know how to apply any of the theory in practice? In most cases, an expert is someone who has a balance of both and the practical experience to match, but not necessarily someone with a Phd. In fact, invariably, an expert does not have a Phd and usually finds the whole approach pointless. However, the way things move in the world, one is an expert one minute and pretty much an outdated expert the next. It takes at least 10 years of practically applied experience to become an expert at something. By which time, either the approach or the technology becomes outdated. So, is it even worth becoming an expert? In some areas there is no other way. However, in most cases an expert is a pointless notion, as it is all fairly subjective. But conservatives will still place more weight on academics than on experience. Ultimately, people learn the hard way when their financial pockets run dry and they need someone to deliver a solution in the shortest amount of time, rather than take any further risks with useless Phds who have no clue whatsoever and have provided quantifiably and qualitatively zero results in terms of return on investment to an organization. The safest bet for organizations is to develop a team of astute generalists who can adapt to change and who have the ability to self-learn, self-explore, and self-grow in their practical endeavors, which is what a degree is supposed to prepare and enable an individual to do at the point of graduation.

22 May 2021

Key Things Missing In Academics

Academics cannot replace practical skills. Many academically inclined people, who tend to have a Phd, have theoretical backgrounds but lack sufficient skills to convert them into practice for them to be of any use to business. Many academic courses at universities also discount certain key areas that almost always occur in practice, or that in the practical world would be defined as common sense. The following highlights four core areas that are almost always necessary for solving business problems in practice and are never sufficiently taught in academia:

  • Rationalization of complexity:
    • Understanding complexity of the world we live in and the fact something that can be done in academic theory may not be possible in practice, given the resources at the time. Understanding such aspects of computational cost, architectural constraints, input/output dependencies, third-party dependencies, scalability requirements, resilience, latency, load, bandwidth, performance, time constraints, funding limitations, management buy-in, cloud resources, access to data, licensing requirements, state management, copyright restrictions, regulatory requirements, skill availability, and other such complexity constraints.
  • Rationalization of uncertainty:
    • Understanding that in practice there are many uncertainty variables, that need to be taken into account, that often may get ignored in academic theory. Often this involves a certain degree of risk. Such risks could take on many forms of economic, geopolitical, social, environmental, government regulation, criminal, accidents, errors, event state outcomes, customer behavior, market demand/trends, failure to deliver, health and safety, loss of team resource, loss of funding, sudden eventual changes, and any such factors outside of immediate control.
  • Rationalization of noisy data:
    • Understanding that in practice data is almost always noisy and that no one will hand you clean data on a silver platter.
  • Rationalization of context:
    • Understanding that in practice everything has context, and it is in this context that things can be constructively applied within the bounds of rational pragmatic thinking. There is no silver bullet that can magically solve all problems of the world. Often context understanding comes with practice as it involves all the above key areas of complexity, uncertainty, and being able to handle noisy data. In most cases, a formula is not provided and the problem is not defined. One has to formulate the problem present in the data, and discover a formula to solve such a problem.
As a side note, it seems the more academic one gets, the less inclined one becomes to use common sense. Even the most simple tasks become difficult to accomplish or require additional help from others. So it seems academics is a trajectory that promises a lot initially but, quantifiably or qualitatively, delivers little in return to the individual in society over the long term.

Matterhorn

Matterhorn

16 May 2021

Six Elements of Web Intelligence

  • Look 
  • Listen 
  • Learn 
  • Connect 
  • Predict 
  • Correct

What is a Knowledge Graph

A knowledge graph has a few key characteristics:

  • Is a type of knowledge base containing semantically integrated and interconnected graph-structured data
  • The representation is in a machine-readable standard that follows an open-world assumption
  • The representation forms an abstraction of a connected graph of relationships 
  • An ability to reason over the connected representation, defined in an open-world assumption, in order to build inference on new relationships
  • The queryable representation is iteratively evolvable through machine-inference, therefore dynamic and not static
  • The ability to semantically infer over machine-readable data allows the abstract representation of data to extend from information into knowledge
  • The semantic context is prevalent and embedded in the knowledge representation, namespaced, and tends to be defined as interlinked URI resources
  • The representation can be stored in any noSQL database, that is able to support the serialized data format, but for performance reasons it tends to be stored in a native graph-like triplestore or a quadstore
  • A property graph without the machine-readable representation and inference layer is not a knowledge graph
  • Without a machine-readable representation and inference layer, the representation is static, merely data, likely defined under a closed-world assumption, that can be queried and managed in a database, leaving much of the inference to the person running the analytics
  • With an additional inference layer and a rich machine-readable representation, the database can take a performance hit in both reads and writes; in most cases writes are sub-optimal to reads, and therefore a write operation tends to be done in the background as a periodic bulk function, with only a read operation made available to the end user (an example of this can be seen between Wikipedia, which is available for reads/writes, and DBpedia, a knowledge graph which is bulk loaded every so often and made available as read-only)
  • Many semantic graph stores still work on a client-server basis with the vertical scaling option of the pre-2000s era, which may pose an issue for heavy reads/writes where reliance is on one core server; to avoid downtime, make bulk writes offline on a hot-swappable replicated instance, and once resolved, swap the instances
  • In a knowledge graph, only metadata is stored (data about data which is in a connected and machine-readable form) and not the changing data itself (keep the data separate from the metadata)
  • A knowledge graph is generally intended for searchability, queryability, discoverability, and findability cases and not for heavy transactional cases
  • If one wants heavy reads/writes then opt for a transactional option; in most cases such databases may not support inference nor provide semantic compliance, but may provide sharding, horizontal scaling, and static property graphs in their serialized data representations, which may be compliant with the TinkerPop stack; alternatively, an in-memory analytical graph may also be an option to avoid heavy I/O performance hits (bear in mind, without inference plus a machine-readable representation it is no longer a knowledge graph but merely static connected graph-structured data stored in a database)
  • A Knowledge Graph is a special case of a Knowledge Base, defined by the graph representation that it holds in the data abstraction in the form of subject-predicate-object triples; however, machine-readable knowledge does not necessarily have to be stored in the form of a triple: the W3C has defined a specific set of standards for working with semantic data, but that is not necessarily the only way
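
To make the triple and query aspects concrete, here is a minimal rdflib sketch using a made-up example.org vocabulary; a real knowledge graph would add an inference layer (e.g. RDFS/OWL reasoning) on top of this kind of machine-readable representation.

```python
# Minimal rdflib sketch of subject-predicate-object triples plus a SPARQL query.
# The ex: vocabulary is made up for illustration; no inference layer is included here.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")
g = Graph()
g.bind("ex", EX)

g.add((EX.ada, RDF.type, EX.Person))
g.add((EX.ada, EX.name, Literal("Ada Lovelace")))
g.add((EX.ada, EX.knows, EX.charles))
g.add((EX.charles, RDF.type, EX.Person))

# Who does Ada know?
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?who WHERE { ex:ada ex:knows ?who . }
""")
for row in results:
    print(row.who)     # http://example.org/charles
```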

26 April 2021

Five Phases of AI Project

  • Definition and Hypothesis (business problem cases and identify value targets) 
  • Data Acquisition and Exploration 
  • Model Building Pipeline and Evaluation 
  • Interpretation and Communication 
  • Automation and Deployment Operations

31 March 2021

Three Approaches to Word Similarity Measures

  • Geometric/Spatial to evaluate relative positions of two words in semantic space defined as context vectors 
  • Set-based that relies on analysis of overlap of the set of contexts in which words occur 
  • Probabilistic using probabilistic models and measures as proposed in information theory
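
As a rough sketch of the first two approaches, the example below computes a geometric (cosine) similarity over made-up context-count vectors and a set-based (Jaccard) overlap over made-up context sets; a probabilistic measure such as pointwise mutual information would instead be estimated from corpus co-occurrence probabilities.

```python
# Sketch of geometric (cosine) and set-based (Jaccard) word similarity.
# The context vectors and context sets below are made up for illustration.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 0.0

# co-occurrence counts with the context words ["drink", "pet", "bark", "milk"]
cat = [3, 9, 0, 6]
dog = [2, 8, 7, 1]
print("cosine(cat, dog)  =", round(cosine(cat, dog), 3))

# contexts in which each word was observed
cat_contexts = {"drink", "pet", "milk"}
dog_contexts = {"drink", "pet", "bark"}
print("jaccard(cat, dog) =", round(jaccard(cat_contexts, dog_contexts), 3))
```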

17 March 2021

TDD and BDD for Ontology Modelling

The end goal of most ontologies is to meet the semantic representation in a specific context of generalization that follows the open-world assumption. However, such model building approaches can manifest in different ways. One approach is to apply test-driven development and behavior-driven development techniques towards building domain-level ontologies where constraint-based testing can be applied as part of the process. The process steps are elaborated below.

  • Create a series of high-level question/answering requirements, which can be defined in the form of specification by example
  • Create SHACL/SHEX tests granular to individual specification examples in context. Each SHACL/SHEX validation basically tests a 'ForSome' case as part of predicate logic per defined question, where a subset of domains/ranges can be tested.
  • Create BDD based acceptance tests and programmatic unit tests that can test logic constraints
  • At this stage all tests fail. In order to make them pass, implement the 'ForSome' closed-world assumption defined in the SHACL/SHEX validation, i.e. implement the representation so that a SPARQL query can answer the given contextual question for the subset cases. Then make the tests pass.
  • Keep repeating the test-implement-refactor stages until all tests pass for the given set of constraints. Incrementally refactor the representation ontology. The refactoring is more about building working generalizations that can transform the closed-world assumption of asserted facts into the partial open-world assumption of unknowns for the entire set.
  • Finally, when all tests pass, refactor the entire ontology solution so it conforms to the open-world assumption for the entire set, i.e. 'ForAll, there exists', which can be further tested using SPARQL against the subsumption hypothesis.
  • If the ontology needs to be integrated with other ontologies build a set of specification by examples for that and implement a set of integration tests in a similar manner.
  • Furthermore, in any given question/answer case, identify topical keywords that provide bounded constraints for a separate ontology initiative; it may be helpful here to apply natural language processing techniques in order to utilize entity linking for reuse.
  • All tests and implementations can be engineered so that they follow best practices for maintainability, extensibility, and readability. The tests can be wired into a continuous integration and maintainable living documentation process.
  • Expose the ontology as a SPARQL API endpoint
  • Apply release and versioning process to your ontologies that complies with the W3C process
  • It is easier to go from a set of abstractions in a closed-world assumption to an open-world assumption than from an open-world assumption to a closed-world assumption. One can use a similar metaphor of going from relational to graph vs graph to relational in context. 
  • Focus on making ontologies accessible to users
  • OWA is all about incomplete information and the ability to infer new information; constraint-based testing may not be exhaustive over the search space, but one can try to test against a subsumption hypothesis
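
As a minimal sketch of what one such programmatic acceptance test could look like, the example below uses rdflib with pytest and a SPARQL ASK query standing in for a competency question; the ontology file name, namespace, and question are hypothetical.

```python
# Sketch of a programmatic acceptance test for a competency question,
# using rdflib and pytest. The file name, namespace, and question are hypothetical.
from rdflib import Graph

COMPETENCY_QUESTION = """
    PREFIX ex: <http://example.org/lab#>
    ASK { ?experiment a ex:Experiment ; ex:hasResult ?result . }
"""

def test_ontology_answers_competency_question():
    g = Graph().parse("lab_ontology.ttl", format="turtle")   # hypothetical ontology + asserted facts
    # Red: this fails until the representation (and any inferred triples) can answer the question.
    assert g.query(COMPETENCY_QUESTION).askAnswer
```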

5 March 2021

Allotrope Framework

Allotrope is used as a semantic framework in the context of scientific data. Or is it really? The framework seems to have received awards. But, on deeper inspection, it looks like a hacked-out solution: not only does it increase complexity, it also enforces tight coupling of semantics, lacks consistency, is laced with bugs, lacks sufficient test coverage, and impedes the data lineage steps necessary for integrating with machine learning components for better explainability. The framework looks like a work in progress and lacks clarity and accessible documentation. In fact, the documentation is inaccessible to the public unless one becomes a member. It also only supports Java and C#, with no real equivalent Python API. An eventual Python API does appear to be in the works, but only time will tell when such a solution is available for use. Although the use of HDF5 as a reference data format is a good choice, the implementation as a whole as a semantic execution platform is not. And worst of all, most of the testing is enforced via SHACL validation, using reverse engineering from an open-world assumption to a closed-world assumption, especially as data can take on multiple dimensions of complexity, not to mention running into the vicious cycle of unmaintainable and unreadable recursive validation cases where there is no clear-cut way for requirement elicitation and subsumption testing. Enforcing what and how one validates semantic data is questionable at best. The framework tries to tackle data complexity with more complexity rather than keeping things simple, accessible, and reusable. After all, it is supposed to be an effort in using standards but seems to apply them in a very opinionated fashion. That is another case in point: the entire framework lacks reusability, with a lot of duplicated work and reinventing of the wheel. Data and design patterns are half-hearted and not well baked into the framework. There must be better ways of doing things than to use frameworks that impede productivity and only increase the frustration of working with semantic data complexity and constraints, where inevitably the solution becomes the problem.

22 January 2021

Federated Protocol

  • Mastodon
  • NextCloud 
  • PeerTube 
  • Friendica 
  • Mobilizon
  • Pixelfed 
  • Pleroma 
  • Misskey

13 January 2021

Machine Learning in Rust

Machine Learning in Rust

Chatbot Evaluations

  • ChatEval 
  • Acute-Eval
  • SSA 
  • NUC 
  • SASSI 
  • WER 
  • DSTC 
  • DSTC2 
  • BLEU 
  • PARADISE
  • QoE
  • IQ
  • Perplexity
  • F1
  • Hits@k
  • Average Utterance Length
  • Ratio of Rare Words
  • Number of Repetitions
  • Number of System Questions
  • Comprehensible
  • Interesting
  • Topical Relevance
  • Response Incorrectness
  • Conversation Continuity
  • Engagement
  • Conversational Depth
  • Coherence
  • Domain Coverage
  • Conversational Diversity and Breadth
  • Naturalness
  • Informativeness
  • Unification Quality
  • User Satisfaction (Questionnaires)
  • User Simulations
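
As a small illustration of the automatic end of this list, the sketch below computes a sentence-level BLEU score with NLTK and a token-level F1 between a reference and a candidate response; the two sentences are made up, and human-judged criteria such as engagement or coherence obviously cannot be computed this way.

```python
# Sketch of two automatic chatbot metrics: sentence BLEU (via NLTK) and token-level F1.
# The reference and candidate responses are made up for illustration.
from collections import Counter
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "the weather in london is rainy today".split()
candidate = "the weather is rainy in london".split()

bleu = sentence_bleu([reference], candidate,
                     smoothing_function=SmoothingFunction().method1)

overlap = sum((Counter(reference) & Counter(candidate)).values())
precision = overlap / len(candidate)
recall = overlap / len(reference)
f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0

print(f"BLEU: {bleu:.3f}")
print(f"F1:   {f1:.3f}")
```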