Word Embeddings - An Alternative and Efficient Approach to Search for Documents


3rd December 2020
Speaker Date: 3rd December 2020
Speaker Time: AEDT: 16:00-17:00
Attendee Date: 3rd December 2020
Attendee Time: IST: 10:30-11:30 | SGT: 13:00-14:00 | AEDT: 16:00-17:00
Duration: 50 mins
Ananth Gundabattula
Senior Architect, Commonwealth Bank of Australia

Open source document search engines typically implement search over a document collection using the TF/IDF principle. However, recent developments in the field of NLP have shown positive results in representing text as more concise vectors rather than as a bag of words. These approaches also enrich the information model, for example by capturing analogies and the semantics of words. This talk walks through an end-to-end data workflow that enables such a construct.

The first part of the session describes the typical flow of how a search query is processed by default in any of the Lucene-powered search engines today. The concept of TF/IDF is also introduced in this part of the session.
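To make the TF/IDF idea concrete, here is a minimal sketch of term-frequency / inverse-document-frequency ranking in plain Python. It is a textbook variant written for illustration and is not the scoring formula of Lucene or any particular engine; the function names (`tokenize`, `build_index`, `tf_idf_score`, `search`) and the sample documents are assumptions made up for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive analyzer)."""
    return text.lower().split()


def build_index(docs):
    """Return per-document term counts and per-term document frequencies."""
    term_counts = [Counter(tokenize(doc)) for doc in docs]
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())  # each document counts once per term
    return term_counts, doc_freq


def tf_idf_score(query, counts, doc_freq, n_docs):
    """Sum tf * idf over the query terms that occur in the document."""
    score = 0.0
    for term in tokenize(query):
        tf = counts[term]
        if tf == 0:
            continue
        # Rare terms get a higher weight; the +1 keeps common terms above zero.
        idf = math.log(n_docs / doc_freq[term]) + 1.0
        score += tf * idf
    return score


def search(query, docs):
    """Return indices of matching documents, best score first."""
    term_counts, doc_freq = build_index(docs)
    scored = [
        (tf_idf_score(query, counts, doc_freq, len(docs)), i)
        for i, counts in enumerate(term_counts)
    ]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ranked if score > 0]


# Hypothetical sample documents, invented for this sketch.
DOCS = [
    "home loan interest rate",
    "car loan application form",
    "interest rate on savings account",
]
```

With these sample documents, `search("car loan", DOCS)` ranks the second document first because "car" is rarer than "loan". A query such as `search("mortgage", DOCS)` returns nothing, since exact term matching cannot relate "mortgage" to "home loan".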

The session then describes the concept of word embeddings, using a library such as Facebook's fastText.
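The sketch below illustrates the general idea behind embedding-based matching: words map to dense vectors, a text is represented by the average of its word vectors, and similarity is measured by cosine. The three-dimensional vectors in `TOY_VECTORS` are invented by hand for illustration; in practice they would come from a trained model such as fastText, which this example does not call.

```python
import math

# Hypothetical 3-dimensional word vectors, hand-made for this sketch only.
TOY_VECTORS = {
    "mortgage": [0.90, 0.10, 0.00],
    "home": [0.80, 0.20, 0.10],
    "loan": [0.85, 0.15, 0.05],
    "car": [0.10, 0.90, 0.10],
    "savings": [0.10, 0.10, 0.90],
    "account": [0.00, 0.20, 0.90],
}


def embed(text):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query = embed("mortgage")
close = cosine(query, embed("home loan"))
far = cosine(query, embed("savings account"))
```

Here `close` comes out much larger than `far`, even though "mortgage" shares no term with "home loan". That is the kind of semantic match a pure bag-of-words score cannot produce.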

Subsequently, a representative data pipeline is discussed, showing how an incoming stream of data can be turned into vector representations and made searchable within a few seconds.
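A pipeline of this shape can be sketched in a few lines, assuming an in-memory index and a brute-force nearest-neighbour scan. This is not the pipeline presented in the talk: `VectorIndex`, `consume`, and `toy_embed` are hypothetical names, and `toy_embed` is only a stand-in that hashes character trigrams so the example runs without a trained model. A real deployment would swap in vectors from an embedding model and an approximate nearest-neighbour index.

```python
import heapq
import math
import zlib


def toy_embed(text, dim=32):
    """Stand-in embedder: hash character trigrams into a fixed-size vector."""
    vec = [0.0] * dim
    padded = f" {text.lower()} "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3].encode("utf-8")
        vec[zlib.crc32(trigram) % dim] += 1.0  # crc32 is stable across runs
    return vec


def _normalize(vector):
    """Scale to unit length; return None for an all-zero vector."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else None


class VectorIndex:
    """Hypothetical in-memory index; documents are searchable once added."""

    def __init__(self):
        self._vectors = {}

    def __len__(self):
        return len(self._vectors)

    def add(self, doc_id, vector):
        unit = _normalize(vector)
        if unit is not None:  # skip documents that produced no features
            self._vectors[doc_id] = unit

    def search(self, vector, k=3):
        """Return up to k (doc_id, cosine similarity) pairs, best first."""
        query = _normalize(vector)
        if query is None:
            return []
        scored = (
            (sum(a * b for a, b in zip(query, vec)), doc_id)
            for doc_id, vec in self._vectors.items()
        )
        return [(doc_id, score) for score, doc_id in heapq.nlargest(k, scored)]


def consume(stream, index, embed=toy_embed):
    """Embed each (doc_id, text) pair from the stream and index it."""
    for doc_id, text in stream:
        index.add(doc_id, embed(text))


index = VectorIndex()
consume([("a", "home loan interest rate"), ("b", "car insurance claim form")], index)
hits = index.search(toy_embed("home loan interest rate"), k=1)
```

Because each document is embedded and added as it arrives, a later call such as `consume([("c", "savings account fees")], index)` makes the new document visible to the very next `search`. The brute-force scan is a simplification that only suits small collections.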

The session closes with a few references to more recent developments in this space.
