Channel: PyData
Category: Science & Technology
Tags: python, learn to code, education, software, pydata, learn, coding, how to program, julia, opensource, scientific programming, numfocus, python 3, tutorial
Description: Every data scientist is familiar with the tedious task of investigating a new data source (e.g. a database dump or logs) and looking for familiar column names and tables in this pile of data. Ran demonstrates how semantic column matching can facilitate the exploration phase, and how pretrained word embeddings, an approximate nearest-neighbor search, and Levenshtein distance can help.
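
As a rough illustration of the Levenshtein-distance part of the approach described above, the following is a minimal sketch, not the speaker's implementation. The helper names (`levenshtein`, `closest_known_column`) and the sample column names are hypothetical, and the word-embedding and approximate nearest-neighbor steps are left out so that the example needs only the standard library.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def closest_known_column(name: str, known_columns: list[str]) -> str:
    """Return the known column name with the smallest edit distance to name.

    Hypothetical helper: the normalization below (lowercasing, hyphens to
    underscores) is an assumption made for this sketch.
    """
    normalized = name.lower().replace("-", "_")
    return min(known_columns, key=lambda known: levenshtein(normalized, known))


if __name__ == "__main__":
    # Hypothetical vocabulary of column names already known to the analyst.
    known = ["customer_id", "order_date", "email"]
    for raw in ["custmer_id", "Order-Date", "e_mail"]:
        print(raw, "->", closest_known_column(raw, known))
```

Edit distance only catches spelling-level variation such as typos or separator differences. Matching names that share a meaning but not a spelling is presumably where the pretrained embeddings and the nearest-neighbor search mentioned in the description come in.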