30 September 2017
Julia
Labels: big data, Cloud, data science, distributed systems, julia, programming, software engineering
Nim
Labels: big data, Cloud, data science, distributed systems, nim, programming, software engineering
Marketcetera
Labels: big data, Cloud, data science, distributed systems, finance, programming, software engineering
29 September 2017
27 September 2017
25 September 2017
22 September 2017
Product Classification and Matching
- A Machine Learning Approach to Product Matching and Categorization
- e-Commerce Product Classification: cDiscounts 2015 Challenge
- Product Classification in e-Commerce Using Distributional Semantics
Kaggle Product Classification Challenges:
Otto Group Product Challenge
cDiscount Challenge
Semantics:
Global Product Classification
Web Data Commons
Schema.org
GoodRelations
Product Classifications as Web Ontologies
Product Classification on Wikipedia
Silk Framework
Labels: big data, data science, deep learning, ecommerce, machine learning, natural language processing, semantic web, text analytics
21 September 2017
Police Open Data
Crime-Stats
US City Census
Types of Background Searches:
- County Criminal History
- State Criminal History
- Federal Criminal History
- Sex Offender Check
- Bankruptcy Check
- Civil Litigation Check
- SSN Verification
- NI Verification
- Government WatchList
- ID Tracing
- Driving Record
- Academic / Professional Check
- Credit Report Check
- Drug Testing Check
- HealthCare Check
- National Criminal Databases
- Internet / Social Media Check
- Travel History Check
- Transaction Verification Check
- Video Footage Check
Semantics3
Labels: big data, data science, deep learning, ecommerce, information retrieval, machine learning
19 September 2017
Loading Data to Redshift
- Using the COPY command and a SQL query via S3
- Streaming data to Redshift using S3 with Kinesis, Kafka, or a dataflow process like StreamSets
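As a rough illustration of the first option, the sketch below assembles a Redshift COPY statement that loads CSV files from an S3 prefix. The helper name `build_copy_statement`, the table name, the bucket, and the IAM role ARN are all hypothetical placeholders, and the CSV format with one header row is an assumption. It only builds the SQL text; running it against a cluster would need a database client, which is left out here.

```python
def build_copy_statement(table, s3_uri, iam_role_arn, ignore_header=1):
    """Return a Redshift COPY statement that loads CSV files from S3.

    All arguments are illustrative; substitute real values before use.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS CSV\n"
        f"IGNOREHEADER {ignore_header};"
    )


# Hypothetical table, bucket, and role, used only to show the shape of the SQL.
statement = build_copy_statement(
    table="events",
    s3_uri="s3://example-bucket/events/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

The streaming option follows the same pattern at the end of the pipeline: files land in S3 first, and a COPY of this shape moves them into the table.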
7 September 2017
Downsides of Pair Programming
- Stifles creativity, innovation, experimentation, learning, and exploration
- Dependency on another person's schedule (lunch/tea/holiday/bathroom breaks, start and end times)
- What do you do when your pairing partner has to leave early to pick kids up from school?
- More talk, less code (how many other people in the office normally talk out loud while they work?)
- Lack of code ownership
- Loss of confidence - again dependency on another
- Higher risk of developer turnover and burnout
- The other developer can slow you down
- Limits out-of-the-box thinking
- Annoying if a developer has to be questioned about everything
- Internal management politics plays into pair programming
- Lack of developer autonomy means less responsibility and focus
- Sometimes neither programmer in the pair is able to solve the problem, and they need even more eyes (!?)
- Doesn't work in areas like data science and big data, where there is a lot of out-of-the-box thinking and exploration
- Pair programming is an anti-pattern in an agile workplace and doesn't make for a nimble programmer
- Your promotion, bonus, and delivery are dependent on the performance of the other programmer
- If a senior programmer works with a junior programmer, they are invariably wasting their time
- If two programmers of the same skill level work together, they are wasting their time
- Thinking out loud about everything can actually be counterproductive
- It makes a programmer less productive and efficient overall in responding to the needs of the business
- The programmer (observer) who is watching the other programmer (driver) is likely to be half asleep partway through the day
- Pair programming feels like working in a sweatshop
- Conflicting egos and skills disparity
- Lack of privacy
- No downtime
- Looking over the shoulder feels weird and distracting
- Increased burnout in the workplace can be a health & safety risk
- Can turn into a tutoring session and discussion rather than actually completing work
- Pairing means you have to use the same developer tools, potentially tools you are less productive in
- Can two people not work better towards the same goal separately?
- How difficult is it to communicate with the person next to you in your team without pairing all day?
- How many people do you see in an office doing pairing work (Finance, Marketing, Reception, etc)?
- Higher turnover and burnout risk mean more time spent on training and recruitment
- If your pair programming partner leaves the company, time will be taken up adapting to the new partner and the pair dynamics in the team
- Can mean increased risk of redundancy if the company or an individual is not performing (the company has to let go of the pair rather than one programmer)
- Do you think out loud about other things in life when you are not programming (most people would call it weird)?
5 September 2017
Graph-Based Taxonomy Finder
There are many ways to build a semantic taxonomy of terms. One approach combines a graph representation with machine learning. At a high level, this approach uses Word2Vec and a minimum spanning tree to build a weighted graph for the taxonomy.
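One way the Word2Vec and minimum spanning tree combination might look is sketched below. Several things here are assumptions rather than details from the post: the toy 2-D vectors stand in for real Word2Vec embeddings, cosine distance is used as the edge weight, and Prim's algorithm is used to extract the tree. The function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity, so closer terms get smaller weights."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def minimum_spanning_tree(vectors):
    """Run Prim's algorithm over the complete graph of terms.

    Returns a list of (term_in_tree, new_term, weight) edges.
    """
    terms = list(vectors)
    in_tree = {terms[0]}
    edges = []
    while len(in_tree) < len(terms):
        # Pick the cheapest edge that connects the tree to a new term.
        best = min(
            (
                (a, b, cosine_distance(vectors[a], vectors[b]))
                for a in in_tree
                for b in terms
                if b not in in_tree
            ),
            key=lambda edge: edge[2],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Made-up vectors for illustration only; a trained Word2Vec model
# would supply the real embeddings.
toy_vectors = {
    "animal": (1.0, 1.0),
    "dog": (1.0, 0.8),
    "cat": (1.0, 1.5),
    "vehicle": (1.0, -1.0),
    "car": (1.0, -0.8),
}

taxonomy_edges = minimum_spanning_tree(toy_vectors)
for parent, child, weight in taxonomy_edges:
    print(f"{parent} -- {child} (weight {weight:.3f})")
```

The resulting tree keeps only the strongest links between terms, which gives a candidate skeleton for the taxonomy. Choosing a root and orienting the edges into parent and child relations is a further step that this sketch leaves open.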
4 September 2017
Reactive Streams
Labels: big data, data science, distributed systems, Java, JavaScript, microservices, scala, spring
1 September 2017