Open sourcing Embedding Projector: a tool for visualizing high dimensional data (googleblog.com)
134 points by runesoerensen on Dec 7, 2016 | 14 comments


I'm really happy to see this. I demonstrated an (in some ways) more advanced version of this a few years ago in my talk on TED.com - see the last few minutes of http://ted.com/talks/jeremy_howard_the_wonderful_and_terrify...

Unfortunately back when I was at Enlitic we never got around to open sourcing this, but I hoped my demo would encourage people to explore tools built on projections. There's great potential to rapidly label data and improve models using these kinds of tools. I'm certainly hoping to find time to return to this myself sometime soon.


Do you know what the best options currently are for the labelling part? I'm looking for a tool to highlight phrases in order to train an entity recognizer. IBM has something, but it's pretty expensive, and I'm not sure what this kind of tool would be called.


I'm not aware of any systems that I think are much good. It's a big gap.


Not entirely sure what you are looking for, but have you seen https://demos.explosion.ai/displacy-ent/?


This is the right sort of UI. I want to be able to highlight text and say what type of entity is selected, in order to train a new model.

It needs to be more of a workflow-based system though, so you can upload tens of thousands of documents and tag them as quickly as is reasonable.


I think BRAT can do that: http://brat.nlplab.org/index.html

I've never used it for doing the annotation myself though.

There are a few other possible tools here: https://omictools.com/text-annotation-category


This looks like what I need, thanks!


Looks cool, but I couldn't quickly try it out. The Projector website caused Chrome to crash: it showed a "WebGL hit a snag" error.

http://projector.tensorflow.org/


Thanks for the feedback. What version of Chrome and what OS are you using?


Version 54.0.2840.99 m on Windows 10.


I was able to run this on Firefox Developer Edition on Linux.


True that, Chrome keeps crashing.


Is there a TF project to build (word) embeddings suitable for this? Gensim is so easy to use, but the added flexibility of TF could be useful.


I couldn't find one, but from what I gather from the docs[1] you can give it basically any TSV file of high-dimensional data. You can have a dig around in the repo here[2].

[1] https://www.tensorflow.org/versions/master/how_tos/embedding...
[2] https://github.com/tensorflow/tensorflow/tree/master/tensorf...
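For what it's worth, here is a rough sketch of what producing such a TSV could look like in plain Python. The function name `write_projector_files` and the file names are made up for illustration, and the random vectors are just a stand-in for whatever your own model (Gensim, TF, anything else) produces. I'm assuming the layout I understand the Projector to accept: one vector per line with tab-separated values, plus an optional metadata file with one label per line and no header when there is only a single column. Check the docs before relying on it.

```python
import random


def write_projector_files(labels, vectors,
                          vec_path="vectors.tsv", meta_path="metadata.tsv"):
    """Write one tab-separated vector per line, and one label per line.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    if len(labels) != len(vectors):
        raise ValueError("need exactly one label per vector")
    if len({len(v) for v in vectors}) > 1:
        raise ValueError("all vectors must have the same dimensionality")

    # Vectors file: no header, one row per point, values separated by tabs.
    with open(vec_path, "w") as f:
        for vec in vectors:
            f.write("\t".join(f"{x:.6f}" for x in vec) + "\n")

    # Metadata file: assumed single-column layout (no header row).
    # Tabs and newlines inside a label would break the row structure,
    # so they are replaced with spaces.
    with open(meta_path, "w") as f:
        for label in labels:
            clean = str(label).replace("\t", " ").replace("\n", " ")
            f.write(clean + "\n")


if __name__ == "__main__":
    random.seed(0)
    words = ["king", "queen", "apple"]
    # Stand-in data: random 50-dimensional vectors instead of real embeddings.
    fake_vectors = [[random.gauss(0, 1) for _ in range(50)] for _ in words]
    write_projector_files(words, fake_vectors)
```

Row order is what ties the two files together, so line N of the metadata file has to describe line N of the vectors file.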



