1 November 2024

GenAI, AGI, and Superintelligence

Artificial Intelligence has come a long way. However, AI still faces significant challenges and several roadblocks. AI systems can assist us with mundane tasks, but they cannot completely replace humans in the many tasks that require complex learning and adaptability. The following highlights the three primary stages of AI advancement. We are currently in the first stage, GenAI. There is still a long way to go before AI becomes a real competitor to humans, let alone surpasses human limits.

Generative AI (GenAI):

  • Focus: Generating audio, video, image, code, and text content for specific applications
  • Capabilities: Trained on huge datasets to generate content that closely resembles human-made work
  • Current State: Advanced use cases across industries for various applications

Artificial General Intelligence (AGI):

  • Focus: Human-level intelligence for generalizable tasks
  • Capabilities: Ability to understand and learn any human task
  • Current State: Not yet achieved; research is ongoing

Superintelligence:

  • Focus: Ability to go beyond human intelligence and abilities
  • Capabilities: Able to do things beyond human comprehension and abilities
  • Current State: Not yet achieved; still at the theoretical stage

Enhanced GenAI With Knowledge Graphs

GenAI For Beginners

Dreams

31 October 2024

Java vs Go vs C vs C++

The following summarizes where each programming language is typically used and how their characteristics differ.

Java

  • Syntax: More verbose, object-oriented
  • Concurrency: Thread-based
  • Memory Management: Garbage Collection
  • Ecosystem: Mature, extensive libraries and frameworks
  • Performance: Generally slower startup time, but good performance at runtime
  • Learning Curve: Steeper learning curve
  • Use Cases: Enterprise Applications, Android Development, Big Data and Data Science

Go

  • Syntax: Concise, more procedural
  • Concurrency: Goroutines and channels
  • Memory Management: Garbage collection
  • Ecosystem: Growing, but less mature than Java
  • Performance: Fast compilation and startup, good runtime performance
  • Learning Curve: Easier to learn
  • Use Cases: Microservices Architecture, Cloud-Native Applications, Network Programming and Systems Programming, High-Performance Applications

C

  • Syntax: Low-level, procedural
  • Concurrency: Threads
  • Memory Management: Manual
  • Ecosystem: Smaller ecosystem, but focused on system-level programming
  • Performance: High performance, low-level control
  • Learning Curve: Steep learning curve
  • Use Cases: Systems Programming, Embedded Systems, Operating Systems

C++

  • Syntax: Complex, object-oriented
  • Concurrency: Threads
  • Memory Management: Manual, with RAII and smart pointers available
  • Ecosystem: Large, complex ecosystem
  • Performance: High performance, fine-grained control
  • Learning Curve: Steep learning curve
  • Use Cases: High-Performance Applications, Game Development, Scientific Computing

22 October 2024

Knowledge Graph Visualization

What We Learned From A Year of LLM

Stable Diffusion 3

What Future Holds For Gen AI

GPT-4o Assistants

How To Tell Which LLM Model Is Best

Google Tiny Model

Transformer Explainer

Grok Image Generator

From GenAI to Multi-Agents

TorchTune

Large Language Model Agents

Understandable Novel Architecture

Llama

AI Crackdown

AI Can Only Do 5% of Jobs

Hollywood Dubbing

Movie Gen

AI Oversight

Preserving Memories

Swarm

Open Canvas

Parallel Web

19 October 2024

Data Entry

Data entry is one of the simplest jobs in administration. The primary skills required are attention to detail and typing. Hiring a careless person can mean applications get rejected because of fat-finger errors. An application here means anything that requires a form: loans, bank accounts, immigration and visas, jobs, vetting, insurance claims, health, academic admissions, and so on. Entering customer details incorrectly can mean non-compliance with GDPR, which requires personal data to be accurate and kept up to date. It also means perfectly good applications get rejected, service suffers, and the customer ends up very frustrated. Replacing human data entry with AI can compound the problem if the model has high false-positive and false-negative rates. The following are examples where system and human errors lead to incorrect processing and storage, and ultimately to rejected applications:

  • Supporting documents show amounts in different currencies and the system says they don't match
  • A number is flagged as incorrect even after documentary proof with the correct number has been provided, sometimes multiple times
  • Making assumptions about what the customer meant rather than what is actually on the form
  • Incorrectly inputting the correct details from the form
  • Giving the customer the runaround for documentation that has already been provided multiple times
  • Failing to use basic common sense when processing customer data forms
  • Flagging a customer based on an incorrect data input, then blaming the customer for it
  • Incorrectly inputting the form data while half asleep
  • Losing the customer's form and expecting the customer to fill out another one
  • Asking for the same information over and over without bothering to read the document
  • Confusing one application form with another
  • Rejecting an application by mixing up details across two separate application forms
  • Giving a customer reasons for declining the application that relate to another customer
  • Sharing one customer's details with another customer in the feedback
  • Trying to correct the spelling of someone's name, as if the customer would not know how to spell their own name
  • Inputting the month before the day, or the day before the month, without checking the form against the system
  • Leaving out critical information from the form
  • Badly worded forms get filled in incorrectly, which leads to many declined applications
  • Credit reference agencies are among the most irresponsible and negligent: many lack even the basic systems needed to resolve, update, verify, validate, store, and process customer data accurately. Members of the public have to chase corrections to simple things like names and addresses, which shows how credit reference agencies fail in their most basic role of proactively exercising due care in protecting customer data.
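
The day/month mix-up in the list above is one of the few errors that software can catch mechanically. Below is a minimal Python sketch, assuming the form uses day-first dates (DD/MM/YYYY). The names `parse_form_date`, `is_ambiguous`, and `matches_system` are hypothetical and do not come from any particular system.

```python
from datetime import date, datetime

FORM_DATE_FORMAT = "%d/%m/%Y"  # assumption: the paper form is day-first


def parse_form_date(text: str) -> date:
    """Parse a date exactly as written on the form; raises ValueError if invalid."""
    return datetime.strptime(text.strip(), FORM_DATE_FORMAT).date()


def is_ambiguous(text: str) -> bool:
    """True when swapping day and month would give a different but still valid date."""
    parsed = parse_form_date(text)
    return parsed.day <= 12 and parsed.day != parsed.month


def matches_system(form_text: str, system_value: date) -> bool:
    """Compare the date on the form with the value keyed into the system."""
    return parse_form_date(form_text) == system_value
```

A date such as 03/04/2024 is ambiguous and could be queued for a second pair of eyes, while 25/12/2024 can only be read one way.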

W3C

W3C is a long-standing consortium for the development and support of web standards. However, in many ways it has also been a hindrance to the community and to the uptake of standards in development.

Benefits:

  • A way for the community to come together, collaborate, and research improvements to web standardization
  • Shape the future of how the web is used
  • Connect with thought leaders across the world
  • Develop consistency, accessibility, and compatibility across the community

Drawbacks:

  • The community members tend to be academic and often very arrogant in their selective interaction and collaboration
  • Community members tend to be racist and discriminatory
  • Practical compliance and ethics often seem like an afterthought
  • A disconnect between academic members vs industry members
  • A disconnect between cultural differences across the spectrum of web standards
  • Majority of academic members are biased and lack basic ethics
  • Lots of favoritism for selective academic members especially for specific sponsored members
  • Processes, tools, and methods are antiquated
  • Standardization efforts are slow moving and lack basic practical insights from industry
  • Collaboration and communication are often discriminatory in nature; don't be surprised if the person on the other end assumes you are a clueless buffoon, which shows how unapproachable many members of the community are
  • Egotistic and arrogant members ruin collaborations, which leads to a dwindling community of active members
  • A lot of the W3C standards are outdated or no longer in favor in industry
  • Web standardization efforts are not progressing fast enough to keep up with the ML/DL community

18 October 2024

ChatGPT vs Gemini

ChatGPT and Gemini are both widely popular Large Language Model chatbots that have gained significant interest in the AI community. ChatGPT was built by OpenAI, and Gemini was built by Google. Each has its own strengths and weaknesses. Both chatbots offer a free tier as well as paid options.

Architecture and Training:

Gemini was trained on a broader and more extensive dataset, which allows for more flexibility across modalities in understanding and generation, along with more complex reasoning for problem-solving.

ChatGPT was trained primarily on text data, which favors more creative content generation. This is likely to evolve across versions to include multiple modalities.

Advantages and Disadvantages:

Gemini provides better coding ability, smoother conversation flow, enhanced understanding of complex concepts, and stronger problem-solving ability. However, it is somewhat weaker on the creative front. It is also newer than ChatGPT.

ChatGPT is better at creative generation tasks that require less understanding and complex reasoning. It has also iterated and evolved over several versions.

Use Cases:

Use Gemini when you want coding assistance and more granular reasoning about content. Use ChatGPT when you want to enhance your creative prowess. However, one common caveat across both chatbots is their questionable ethics, bias, and privacy controls. To err on the side of caution, don't share personal data with the chatbots.

ChatGPT

16 October 2024

Schema.org

Schema.org markup is useful to have on a website: it makes the site search-engine friendly by helping search engines understand its content and internal structure, which enables better search results. However, the schema.org website lacks clarity and is cluttered and difficult to navigate.

Benefits:

  • Enables search engines to better understand site content, which can help sites rank higher in search results
  • Improve click-through rates and organically increase traffic on site 
  • Provide more flexibility and context to how sites appear in search results
  • Increase user search relevancy
  • Improve strategy for content and context
  • Improve user experience
  • Flexibility in markup format: Microdata, RDFa, or JSON-LD
  • Provides a meta vocabulary to define the context of the site
  • Extract who, what, when, why from sites 
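
To make the JSON-LD option concrete, here is a minimal Python sketch that builds a schema.org `Article` object and wraps it in the script tag that crawlers read. `Article`, `headline`, `datePublished`, `author`, and `Person` are existing schema.org terms; the helper name `article_jsonld` and the sample values (including the author name) are made up for illustration.

```python
import json


def article_jsonld(headline: str, date_published: str, author_name: str) -> str:
    """Return a <script> element carrying schema.org Article markup as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,  # ISO 8601 date string
        "author": {"@type": "Person", "name": author_name},
    }
    body = json.dumps(data, indent=2)
    # Escape "</" so a value can never close the script element early;
    # "\/" is a legal JSON escape for "/".
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical sample values.
snippet = article_jsonld("ChatGPT vs Gemini", "2024-10-18", "Jane Doe")
```

The resulting string would be placed in the page's HTML. Whether it changes how the page appears in search results is up to each search engine.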

Drawbacks:

  • Unfriendly schema.org community for suggestions, feedback, and improvements
  • Submitting new changes or schemas is slow and often fraught with frustration 
  • The schema.org site is cluttered and difficult to navigate
  • Community is not very open and is unwelcoming to new users
  • No real reasoning and significant effort towards web of data queryability 
  • Community is discriminatory towards user suggestions, submissions, and approval process 
  • Very opinionated and closed community which makes it unconstructive 
  • Huge Google bias with often rude and arrogant community members 
  • Markup often is buggy, flawed, and inflexible to community changes 
  • Process is fraught with trial and error
  • Difficult to develop a strategy around the markup 
  • Difficult to implement at scale with larger websites 
  • Maintaining markup is a challenge 
  • It is subjective and questionable whether the markup significantly improves discoverability 
  • Limited tools that support and provide insights into the markup 
  • Inflexible schema.org developer community makes the standard inaccessible, inextensible, and unmaintainable 
  • Unclear documentation on the schema.org website 
  • The markup is still very limited in context and scope especially for larger websites 
  • Lacks sufficient domain coverage as a markup

Although the project has a long history and many websites use the markup, it has significant drawbacks that justify alternative efforts. The project is also Google sponsored, with a significant corporate bias that undermines its merit and accessibility for open community engagement. The slow process means the markup is slow to reach its full potential. An active, open community might speed things up, but this is likely to meet significant resistance from the existing community of developers, who are not very forthcoming with community engagement. Bugs in schemas take a very long time to resolve, and recommendations are usually not appreciated in the community. Websites have good reason to be concerned about adopting markup whose community is often unapproachable and inflexible. After all, it is supposed to be a web standard that arises from community effort and engagement. Lastly, a markup that lacks readability, reasoning, and trust as a web standard is likely to be insufficient for the spectrum of web crawling, semantic search, web of data, and AI in general.

13 October 2024

Nostr

Netflix Terrible Recommendations

Netflix is all hype. The quality of recommendations and personalization is terrible, and the content lacks variety. The entire flow of the rating system looks flawed. When you refresh the browser, the same content you gave a thumbs down reappears at the very top. The infinite scrolling is annoying. The algorithm does not personalize to a user's watch habits. The majority of the recommendations seem to be new and irregular.
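
The thumbs-down complaint describes a missing filtering step. The sketch below is not Netflix's algorithm; it is a hypothetical post-processing pass, with made-up names (`filter_recommendations`, `thumbs_down`), showing how explicitly rejected titles could be removed from a ranked list before it is displayed.

```python
def filter_recommendations(ranked_titles, thumbs_down):
    """Drop titles the user has rejected while keeping the ranking order."""
    blocked = set(thumbs_down)
    return [title for title in ranked_titles if title not in blocked]


# Hypothetical ranked output from a recommendation model.
ranked = ["Title A", "Title B", "Title C", "Title D"]
shown = filter_recommendations(ranked, thumbs_down={"Title B", "Title D"})
```

Applying the filter after ranking keeps it independent of whatever model produced the list, so a refresh cannot bring a rejected title back to the top.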

Netflix has an inaccurate and insufficient data collection process, which leaves an incomplete dataset for a recommendation model. The model captures neither how you use the platform nor how you don't use it, and it ignores user interests and intents.

Netflix algorithms try to measure correlation but do not sufficiently address causation. They cannot answer "why". Yet the whole point of a recommendation algorithm is to use such insights to make deeper contextual decisions when matching items to users.

Netflix algorithms lack sufficient reasoning skills to understand the habits of the user and provide better recommendations with respect to context, intents, and interests. The connections drawn between two points of data seem blurred. This may be because tastes are not fixed but tend to change. Unfortunately, there is also a lack of filters that would let the user provide additional data, such as preferences stored as part of the user profile. The fundamental filtering attributes for grouping items with the user in context seem to be missing, and quality recommendations depend on them. There is also little to no common sense in the recommendations. It often seems like the user is trapped in a bubble of sorts. One would also assume that Netflix would recommend its own productions to recoup production costs through the platform and benefit from user data. This could even provide insights for future production projects.

Finally, there is a significant lack of content variety on the platform, with a huge bias towards Indian content over other regional content. The content library needs a complete reboot, with more flexibility in user preference filters, and new content needs to arrive more frequently for Netflix to compete sustainably with other streaming platforms.

19 September 2024

Worst Regions for Employment

The following regions have the worst employers, where the chances of being exploited are high and the pay is so low that you are almost expected to work for free:

  • India
  • Israel
  • Saudi Arabia
  • Kuwait
  • Brazil
  • Bangladesh
  • Belarus
  • Ecuador
  • Egypt
  • Philippines
  • Colombia
  • Guatemala
  • Honduras
  • Romania
  • Mexico
  • Nigeria
  • Sudan
  • Madagascar
  • Costa Rica
  • Venezuela
  • Bulgaria
  • Myanmar
  • Tunisia
  • Algeria
  • Iraq
  • Yemen
  • Afghanistan
  • Nepal
  • Bhutan
  • North Korea
  • Moldova
  • Ukraine
  • Georgia
  • North Macedonia
  • Albania
  • Burkina Faso
  • Chad
  • Ethiopia
  • Uganda
  • Somalia
  • Djibouti
  • Libya
  • Mauritania
  • Ghana
  • Eritrea
  • Kenya
  • Senegal
  • Peru
  • Western Sahara
  • Liberia
  • Sierra Leone
  • Niger
  • Cameroon
  • Rwanda
  • Tanzania
  • Malawi
  • Mozambique
  • Angola
  • Namibia
  • Botswana
  • Zimbabwe
  • Zambia
  • Argentina
  • Bolivia
  • Chile
  • Uruguay
  • Paraguay
  • Cuba
  • Puerto Rico
  • Vietnam
  • Cambodia
  • Thailand
  • Laos
  • South Sudan

7 June 2024

Scam Jobs at Indian Companies

If you are ever approached by Indian companies in either Europe or the USA, be warned. Most of their roles are placeholders. They advertise scam jobs. Most of the work will likely be carried out in India on the cheap. The jobs they have in Europe and the USA are mostly window dressing for client engagement. Avoid at all costs, unless you want to be stuck in an unhappy job with zero work/life balance within a terrible organization.

Things to notice:

  • Catch-all job descriptions with excessive keyword stuffing
  • Interviewers are very passive, at times aggressive, and not actively engaged
  • Interviewers gloss over the technical details when you ask probing questions
  • Interviewers clearly indicate that 'you won't be doing this'
  • Interviewers are very high-level about most things
  • Compensation is brought up at every stage in an attempt to reduce it and to ask how flexible you are
  • Through the stages you will find out that the entire process is a scam to fill roles as placeholders
  • Through the stages you will find out that the role is not what is advertised, in fact the job title likely does not match the job description
  • Check the reviews, most likely other candidates have also been treated badly through the process
  • They will often seem in a rush to hire, likely because they have not yet secured funding for the project
  • Most Indian companies have significantly lower payscales that are below local market rates
  • Their entire process will be riddled with unprofessionalism
  • At times, the HR and interviewer engagement will also be quite rude and unapproachable
  • Be very careful to spot the deceptions; Indian companies are notorious for lying both to clients and to the general public
  • Non-compliance issues with processing and storing of your personal info (some may even share your personal info with third-parties)
  • There is a reason why people talk so badly about Indian companies on social media discussion boards: simply put, a huge number of people have had bad experiences

Sample companies that often advertise scam jobs on job boards:
  • Tata Consultancy
  • HCL
  • Wipro
  • Infosys
  • Mindtree
  • Mphasis
  • Mindspace
  • Tech Mahindra
  • NTT Data (although a Japanese company, specifically its India-heavy departments)

Increasingly, outsourcing means more and more companies are advertising scam jobs. Some USA and European companies with big Indian hubs also advertise fake job ads:
  • Accenture
  • KPMG
  • Deloitte
  • Ernst & Young
  • CapGemini
  • Sopra Steria