Finland transmits, that...: 2018

Wednesday, December 12, 2018

Automatic writing with Deep Learning: Progress

This is a continuation of the post https://dmitrykan.blogspot.com/2018/05/automatic-writing-with-deep-learning.html. This item was reblogged at Writer's DZone: https://dzone.com/articles/automatic-writing-with-deep-learning-progress

Fast forward a few months (apologies for the delay), and I can share some findings.

Again, I think we should take AI co-writer exercises with a grain of salt. However, during this time I have come across practical use cases for such systems.

One of them is augmenting a news article writer. More specifically, when writing a news item, one of the most challenging tasks is to coin a catchy title. Does the title have some trendy phrases in it? Or perhaps it mentions an emerging topic that captures attention at this given moment? Or reuses a pattern that worked well for this given author? Or just spurs an idea in the author's head?

In the following exercise I have set a very modest goal: train a co-writer on previously written texts and see whether it can suggest something useful from them. I could imagine that this could be extended to texts that are trending, to a collection of particularly interesting titles, or what have you.

To train such a model I have used Robin Sloan's RNN writer: https://github.com/robinsloan/rnn-writer. The goodies of the project are:

Trained on Torch. Nowadays, Torch is leveraged via PyTorch, a deep learning Python library that is nearing production readiness.
The trained model gets exposed in Atom, a pluggable editor (I'd imagine real writers would want to have the model integrated into their favourite editor, like Word).
An API is available too for integrating into custom apps (and this is exactly how it is integrated with Atom).

I will skip the installation of Torch and the training of the network and proceed to examples. The rnn-writer GitHub repository has a good set of instructions to follow. I have installed Torch and trained the model on a Mac.

First things first: an RNN trained on my Master's Thesis "Design and Implementation of Peer-to-Peer Network" (University of Kuopio, 2007).

The text of the Master's Thesis is about 50 pages in English with diagrams and formulas. On one hand, more data lets a neural network learn more word representations and should give it a larger probability space from which to predict the next word given the current word or phrase. On the other hand, limiting the input corpus to phrases that serve a specific domain goal, like writing an email, could yield a clean set of phrases that a user employs in many typical email passages.
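To make "predict the next word given the current word" concrete, here is a minimal sketch. It is not the RNN from rnn-writer: it swaps in a much simpler bigram count model, and the toy corpus is a hypothetical stand-in for the thesis text. It only illustrates how the probability space over next words is derived from the training text.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the corpus."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def suggest_next(followers, word, k=3):
    """Return up to k (next_word, probability) pairs, most likely first."""
    counts = followers.get(word.lower())
    if not counts:
        return []  # the word never appeared with a follower in the corpus
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(k)]


# Hypothetical toy corpus standing in for the thesis text.
corpus = "the peer sends a query and the peer waits and the network replies"
model = train_bigrams(corpus)
print(suggest_next(model, "the"))
```

With a larger corpus the counters hold more distinct followers per word, which is the "larger probability space" trade-off: more choice, but also more noise than a narrow, domain-specific corpus would give.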

As I got access to Fox articles, I thought this could warrant another RNN model and a test. Something to share next time.

Sunday, May 6, 2018

Automatic writing with Deep Learning: Preface

This article was also reblogged at: https://dzone.com/articles/automatic-writing-with-deep-learning-preface

Quite a few machine learning and deep learning problems are directed at building a mapping function of roughly the following form:

Input X ---> Output Y,

where:

X is some sort of an object: an email text, an image, a document;

Y is either a single class label from a finite set of labels (like spam / no spam, a detected object or a cluster name for this document) or some number (like next month's salary or a stock price).
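As a rough illustration of the two kinds of Y, here is a minimal sketch. Both functions are hand-written rules standing in for a trained model, and the keyword list and the extrapolation rule are assumptions made up for this example.

```python
def classify_email(text):
    """X is an email text, Y is one label from a finite set."""
    spam_markers = {"lottery", "winner", "prize"}  # hypothetical keyword list
    words = set(text.lower().split())
    return "spam" if words & spam_markers else "no spam"


def predict_salary(previous_salaries):
    """X is a salary history, Y is a number (naive linear extrapolation)."""
    if len(previous_salaries) < 2:
        return float(previous_salaries[-1])
    # Average change per month over the observed history.
    step = (previous_salaries[-1] - previous_salaries[0]) / (len(previous_salaries) - 1)
    return previous_salaries[-1] + step


print(classify_email("You are a lottery winner"))
print(predict_salary([3000, 3100, 3200]))
```

In a real system the body of each function would be learned from training data, but the shape of the mapping stays the same.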

While such tasks can be daunting to solve (like sentiment analysis or predicting stock prices in real time), they require rather clear steps to achieve good levels of mapping accuracy. Again, I'm not discussing situations with a lack of training data to cover the modelled phenomenon or poor feature selection.

In contrast, somewhat less straightforward areas of AI are the tasks that challenge you to predict structures as fuzzy as words, sentences or complete texts. What are the examples? Machine translation for one, natural language generation for another. One may argue that transcribing audio to text is also this type of mapping, but I'd argue it is not. Audio is a "wave" and speech recognition is a reasonably well solved task (with state of the art above 90% accuracy); however, such an algorithm does not capture the meaning of the produced text, except where it is necessary to disambiguate what was said. Again, I have to make it clear that the audio->text problem is not at all easy and has its own intricacies, like handling speaker self-corrections, noise and so on.

Lately, the task of writing texts with a machine (e.g. here) caught my eye on Twitter. Previously, papers from Google on writing poetry or other text-producing software were giving me creepy feelings. I somehow underestimated the role of such algorithms in the space of natural language processing and language understanding and saw only diminishing value of such systems to users. Again, any challenging task might be solved and even bring value to solving other challenging tasks. But who would use an automatic poetry writing system? Why would somebody, I thought, use these systems, just for fun? My practical mind battled against such "fun" algorithms. Again, making an AI/NLProc system capable of producing anything sensible is hard. Take the task of sentiment analysis, where it is quite unclear what the agreement between experts is, not to mention non-experts.

I think this post has poured enough text onto the heads of my readers. I will use this post as a self-motivating mechanism to continue the research with systems producing text. My target is to complete the neural network training on the text from my Master's thesis and show you some examples so you can judge the usefulness of such systems.

Saturday, May 5, 2018

AI for lip reading

It is exciting to push your imagination on where else you can apply AI, machine learning and most certainly deep learning, which is so popular these days. I came across this question on Quora that provoked me to think a bit about how one would go about training a neural network to lip read. I don't actually know what made me answer this question more: that I found myself in an unusual context, sitting at an AngularJS meetup at the Google offices in New York City (after work, the usual level of tired), or the question itself. Whatever the reason, here is my answer:

Source: http://theconversation.com/our-lip-reading-technology-promises-to-make-hearing-aids-more-human-45166

I would probably first start with formalizing what the lip reading process is from the point of view of a human-understandable algorithm. Maybe it is worth talking to a professional, like a spy or something. Obviously you need training data. Understanding what lip reading is from the algorithm's perspective will affect what data you need.

1. To read a word of several syllables you’d need a sequence of anchor lip positions that represent syllables. Or probably vowels / consonants. See, I don’t know which one is best. But you’d need to start with the lowest level possible out of which you can compose larger sequences, like letters -> syllables -> words. Let’s call these states.
2. A particular lip posture (is that the right word?) will most probably map to several ambiguous states.
3. Now the interesting part is how to resolve the ambiguities. Number 2 produces several options. Out of these you can produce a multitude of words that we can call candidates.
4. Then you need to score candidates based on some local context information. Here it turns into natural language understanding.
5. I'd start with seq2seq.
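The steps above can be sketched roughly as follows. This is not a seq2seq model: it is a minimal dictionary-based illustration of the states -> candidates -> scoring idea, and the posture names, the posture-to-sound mapping, the vocabulary and the context counts are all hypothetical values invented for the example.

```python
from itertools import product

# Hypothetical mapping from an observed lip posture to the ambiguous
# states (here: sounds) it could stand for.
POSTURE_TO_STATES = {
    "lips_closed": ["p", "b", "m"],
    "lips_open": ["a"],
    "tongue_behind_teeth": ["t", "d", "n"],
}

# Hypothetical toy vocabulary used to filter state sequences into words.
VOCABULARY = {"bat", "mat", "pat", "man", "pan", "bad", "mad"}


def candidates(postures):
    """Expand every posture into its states and keep sequences that are words."""
    options = [POSTURE_TO_STATES[p] for p in postures]
    words = ("".join(seq) for seq in product(*options))
    return sorted(w for w in words if w in VOCABULARY)


def best_candidate(postures, previous_word, context_counts):
    """Score candidates by how often they followed the previous word."""
    words = candidates(postures)
    if not words:
        return None
    return max(words, key=lambda w: context_counts.get((previous_word, w), 0))


observed = ["lips_closed", "lips_open", "tongue_behind_teeth"]
# Hypothetical local context: counts of (previous word, word) pairs.
counts = {("old", "man"): 5, ("old", "bat"): 1}
print(candidates(observed))
print(best_candidate(observed, "old", counts))
```

A learned model would replace both the hand-written posture mapping and the context counts, but the ambiguity it has to resolve is the same.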

Tuesday, January 16, 2018

New Luke on JavaFX

Hello and Happy New Year to my readers!

I'm happy to announce the release of a completely reimplemented Luke, built with JavaFX. Luke is the toolbox for analyzing and maintaining your Lucene / Solr / Elasticsearch index at a low level.

The implementation was contributed by Tomoko Uchida, who also did the honors of releasing it.

The excitement of this release is supported by the fact that in this version Luke becomes fully compliant with the ALv2 license! And it gets very close to being contributed to the Lucene project. At this point we need lots of testing to make sure the JavaFX version is on par with the original Thinlet-based one.

Here is what the load index screen looks like in the new JavaFX luke:

After navigating to the Solr 7.1 index and pressing OK, here is what luke shows:

I have loaded an index of Finnish Wikipedia with 1,069,778 documents, and luke tells me that the index does not have deletions and was not optimized. Let's go ahead and optimize it:

Notice that in this dialogue you can request only the expunging of deleted docs, without merging (the costly part for large indices). After the optimization is complete, you'll have a full log of actions in front of you to confirm the operation was successful:

You could also opt for checking the health of your index via Tools -> Check index menu item:

Let's move to the Search tab. It has changed slightly in that the search box has moved to the right, while search settings and other knobs have moved to the left.

Thinlet version:

JavaFX version:

The UI is now more intuitive in terms of access to various tools like Analyzer, Similarity (now with access to the parameters of the new BM25 ranking model, which became the default in Lucene and in luke) and More Like This. There is a new Sort sub-tab that lets you choose a primary and secondary field to sort on. The Collectors tab, however, is gone: please let us know if you used it for some task, we would love to learn.

Moving on to the Analysis tab, I'd like to draw your attention to the really cool functionality of loading custom jars with your implementation of a character filter, tokenizer or token filter to form your custom analyzer. Test these right in the luke UI without the need to reload shards in your Solr / Elasticsearch installation:

Last but not least is the Logs tab. You have probably been missing it for as long as luke has existed: it gives you a handle on what's happening behind the scenes during an error case or normal operation.

In addition, this version of Luke supports the recently released Lucene 7.2.0.
