Two weeks ago we attended Machine Learning Prague 2019, the “Practical conference about ML, AI & Deep Learning applications”, held from February 22nd to 24th, 2019.

We had a great time at the conference and deepened our knowledge of ML. The two areas most relevant to us, which we would like to cover, are “Topic Modeling” (in this post) and “Anomaly Detection” (in the second part, coming soon). Let’s get started!

Topic Modeling with Machine Learning 

We heard about a large variety of interesting applications of Machine Learning. One highlight was a talk by Alexander Loosley of Data Reply called “Solving the Text Labeling Challenge with EnsembleLDA and Active Learning”.

He discussed how to effectively identify and label topics within a huge corpus of text.  

We are interested in this topic at TheVentury because it has a wide variety of potential applications and it builds on our experience in Natural Language Processing.

Discovering topics in social media posts 

One of the great things you can do with topic modeling is summarize what customers are saying on social media and automatically discover the most important message topics.

This can guide you to address the most relevant issues for your customers and can also flag critical messages as they come in. 

In the presentation, Alexander discussed the EnsembleLDA method for robustly identifying topics.

LDA stands for Latent Dirichlet Allocation, a Machine Learning method commonly used to group documents by topic. It models each document as a mixture of topics and each topic as a distribution over words.
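To make the idea more concrete, here is a minimal sketch of LDA written in plain Python, using a collapsed Gibbs sampler. It is a toy under stated assumptions, not the implementation shown in the talk: the function name fit_lda, the example messages, and the parameter values (alpha, beta, the number of iterations) are all our own illustrative choices, and a real project would normally rely on an established library instead.

```python
import random

def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Start from a random topic for every token (the "random starting point").
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = int(rng.random() * num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Draw a new topic in proportion to the weights.
                r = rng.random() * sum(weights)
                k = 0
                while k < num_topics - 1 and r >= weights[k]:
                    r -= weights[k]
                    k += 1
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates: word distribution per topic, topic mixture per document.
    topics = [
        {vocab[v]: (topic_word[k][v] + beta) / (topic_total[k] + vocab_size * beta)
         for v in range(vocab_size)}
        for k in range(num_topics)
    ]
    doc_mix = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return topics, doc_mix

# Hypothetical customer messages, already tokenized.
docs = [
    "delivery late package shipping delay".split(),
    "shipping package lost delivery".split(),
    "refund payment invoice charge".split(),
    "invoice payment charge refund card".split(),
]
topics, doc_mix = fit_lda(docs, num_topics=2)
for k, topic in enumerate(topics):
    top_words = sorted(topic, key=topic.get, reverse=True)[:3]
    print(f"topic {k}: {top_words}")
```

On a corpus this small the topics found can change with the seed, which is the kind of instability that makes repeated runs worthwhile.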

EnsembleLDA repeats LDA multiple times with different random starting points and keeps the topics that reappear across runs, so that the most robust topics can be identified. This method gives confidence in the topic labels so that they can be used in analysis or other business applications.
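The "keep what reappears" idea can be sketched in a few lines. The example below is a simplified illustration and not necessarily the procedure presented in the talk: it compares topics from several runs by cosine similarity and keeps a topic from the first run only if every other run contains a close match. The function names, the similarity threshold of 0.8, and the hand-written topic distributions are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two topics given as word -> probability dicts."""
    dot = sum(p * b.get(w, 0.0) for w, p in a.items())
    norm_a = math.sqrt(sum(p * p for p in a.values()))
    norm_b = math.sqrt(sum(p * p for p in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def stable_topics(runs, threshold=0.8, min_support=1.0):
    """Keep topics from the first run that reappear in enough of the other runs."""
    reference, others = runs[0], runs[1:]
    stable = []
    for topic in reference:
        # Count the runs that contain at least one sufficiently similar topic.
        matches = sum(
            1 for run in others
            if any(cosine(topic, other) >= threshold for other in run)
        )
        if others and matches / len(others) >= min_support:
            stable.append(topic)
    return stable

# Hypothetical topics from three LDA runs with different random seeds.
runs = [
    [{"delivery": 0.5, "package": 0.3, "shipping": 0.2},
     {"refund": 0.6, "invoice": 0.4},
     {"login": 0.7, "password": 0.3}],
    [{"delivery": 0.4, "package": 0.4, "shipping": 0.2},
     {"refund": 0.5, "invoice": 0.5},
     {"delivery": 0.5, "refund": 0.5}],
    [{"refund": 0.7, "invoice": 0.3},
     {"delivery": 0.6, "package": 0.2, "shipping": 0.2},
     {"password": 0.2, "card": 0.8}],
]
stable = stable_topics(runs)
for topic in stable:
    print(sorted(topic, key=topic.get, reverse=True))
```

In this made-up data the delivery and refund topics show up in every run and are kept, while the login topic appears only once and is dropped.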

Discovering topics in emails 

Topic Modeling can also be used to categorize and label emails. The many emails sent and received about a project or the company contain important information. With Machine Learning, emails can be grouped by their major topics and then further divided into smaller subtopics.

This allows you to categorize, explore, and summarize the relevant information in the emails. Super useful!

Using Topics to sort and search documentation 

Another application for Topic Modeling is to categorize internal documentation.  If you or your employees spend significant time searching through internal documentation, you can greatly improve this experience using Machine Learning.

We can make it easy to quickly find the correct document through keyword searches based on topics.
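As a rough sketch of how a topic-based keyword search might work, assume a topic model has already given each document a topic mixture and each topic a set of keywords. A query is then mapped to the topics whose keywords it contains, and documents are ranked by how strongly they belong to those topics. All file names, topic names, keywords, and weights below are hypothetical.

```python
# Hypothetical output of a topic model: keywords per topic ...
topic_keywords = {
    "onboarding": {"laptop", "account", "welcome"},
    "expenses": {"invoice", "receipt", "reimbursement"},
}
# ... and the topic mixture of each internal document.
doc_topics = {
    "new-hire-checklist.txt": {"onboarding": 0.9, "expenses": 0.1},
    "travel-policy.txt": {"expenses": 0.8, "onboarding": 0.2},
    "office-map.txt": {"onboarding": 0.6},
}

def search(query, topic_keywords, doc_topics, top_n=3):
    """Rank documents by their weight in the topics matched by the query terms."""
    terms = set(query.lower().split())
    matched = [t for t, keywords in topic_keywords.items() if terms & keywords]
    scored = []
    for doc, mix in doc_topics.items():
        score = sum(mix.get(t, 0.0) for t in matched)
        if score > 0:
            scored.append((score, doc))
    # Highest score first; ties broken alphabetically by document name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc for _, doc in scored[:top_n]]

results = search("where do I send a receipt", topic_keywords, doc_topics)
print(results)
```

Because the query is matched through topics rather than exact text, a document can be found even if it never contains the search word itself.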

Visualization 

With any of these applications, it is important to be able to visualize and explore the data and topics.  Alexander showed some interesting visualization methods which help to understand the data, improve the methodologies, and produce the most accurate topic models.

It was a highly engaging and interesting talk!  Thanks again Alexander!

Next up

In our next blog post we will discuss “Anomaly Detection”, which was also widely discussed at #mlprague! The general idea is to use past data to decide whether new data are conforming or anomalous. Watch out for it!

 

Does Topic Modeling sound like something your business could benefit from?

We are happy to have a talk about possible solutions that meet your needs.

Let’s talk: maximilian.unger@theventury.com