Volunteer Experience
-
Author and Translator
NLPub
Science and Technology
https://nlpub.ru/
-
Session Chair, Program Committee Member & Reviewer
Various Conferences
- 5 years 4 months
Science and Technology
Acted as a program committee member (PCM). Chaired conference sessions. Peer-reviewed conference research papers. Topics included: neural networks, aspect-based sentiment analysis, domain-specific multi-word term extraction, fuzzy duplicate detection, topic models for social media, word semantic similarity, POS tag disambiguation, natural language processing.
Conferences:
◦ AINL: Artificial Intelligence and Natural Language Conference (June – October 2016) – ainlconf.ru
◦ ISMW FRUCT Conference (June – July 2016) – ismw-fruct.spbu.ru
◦ AIST: Analysis of Images, Social Networks and Texts Conference (March 2016) – aistconf.org
◦ AIST: Analysis of Images, Social Networks and Texts Conference (March – April 2015) – aistconf.org
◦ 2nd International Symposium on Business Modelling and Software Design (July 2012) – is-bmsd.org/Documents/ProceedingsOfSecondBMSD.pdf
◦ 6th International Conference on Software and Data Technologies (July 2011) – chaired the Knowledge-Based Systems track session, organized the paper presentations, and moderated discussions.
Publications
-
https://dzone.com/articles/will-deep-learning-make-other-machine-learning-alg
DZone / AI Zone
We'll talk about deep learning and its role in making other state-of-the-art machine learning methods obsolete.
-
Fun With Google Machine Translation
DZone / AI Zone
Google has switched its translation system over to a neural network implementation, and sometimes the answers can be quite humorous!
-
Lightweight Java Profiler and Interactive svg Flame Graphs
Own blog
This blog post walks you through setting up the flame graphs toolkit and illustrates its use for profiling an Apache Solr instance.
-
A Weka Project for Sentiment Analysis (original Russian title: Weka проект для задачи распознавания тональности (сентимента))
Habrahabr
The article presents a complete Java project for sentiment analysis of texts (using English as the example language) with good quality: over 83% with an unbalanced training set and over 84% with a balanced one.
-
Low-level testing your Lucene TokenFilters
Own blog
At the recent Berlin Buzzwords conference, in his talk on Apache Lucene 4, Robert Muir mentioned Lucene's internal testing library. This library is essentially the collection of classes and methods that form the test bed for Lucene committers. As a matter of fact, the same library works perfectly well in your own code. David Weiss has talked about randomized testing with Lucene, which is not the focus of this post but is a great way of running your usual static tests with randomization.
This post shows a few code snippets that illustrate how to use the Lucene test library to verify the consistency of your custom TokenFilters at a lower level than you might be used to.
-
Using system disk cache for speeding up the indexing with SOLR
DZone
The article benchmarks batch indexing of XML files with Apache Solr. It concludes that using the system disk cache before indexing starts can increase the indexing speed.
-
Implementing own LuceneQParserPlugin for Solr
DZone
This article goes beyond simply implementing a query parser for Solr. It shows how to implement your own plugin class, from which the query parser is then extended. The beauty of the solution is its ease of deployment onto your Solr instance, with no need to modify the Solr source code.
-
Monitoring Solr with Graphite and Carbon
DZone
Monitoring distributed systems is essential for observing and predicting system load. The article explains how to set up monitoring of the popular, blazing-fast search platform Apache Solr with the Graphite and Carbon tools.
-
Rule-based approach to sentiment analysis at ROMIP 2011
Dialogue 2012
This paper describes a rule-based approach to sentiment analysis that performs shallow parsing of an input text in Russian and applies a set of linguistic rules to resolve the sentiment of a given chunk (subclause, sentence, or text). The algorithm shows decent performance (90% precision for the positive class) in cases where annotators agreed on a sentiment label, and it supports sentiment classification relative to an object in the text.
See the paper here (last accessed: October 24, 2012):
http://www.dialog-21.ru/digests/dialog2012/materials/pdf/Kan.pdf
-
Method for an Automatic Generation of a Semantic-level Contextual Translational Dictionary
ICSOFT (2) 2011
In this paper we demonstrate the semantic feature machine translation (MT) system as a combination of two fundamental approaches, where the rule-based side is supported by the functional model of the Russian language and the statistical side uses statistical word alignment. The MT system relies on a semantic-level contextual translational dictionary as its key component. We present a method for automatic generation of the dictionary, where disambiguation is done on the semantic level.
See also the poster (last accessed: October 24, 2012):
http://www.slideshare.net/dmitrykan/poster-method-for-an-automatic-generation-of-a-semantic
Courses
-
Image Analysis
-
-
Machine Learning (Stanford University)
-
-
Natural Language Processing (Stanford University)
-
-
Russir 2011
-
-
Startup Engineering (Stanford)
CME/CS184
-
Technology Entrepreneurship Part 1 (Stanford)
-
-
Web Of Data at RuSSIR 2011
-
Projects
-
Advisor at Oppi.AI
- Present
Advisory role at Oppi.AI (formerly SmallStep.AI): AI for education.
-
Apache Solr Enterprise Search Server - Third Edition
This book is a comprehensive resource for just about everything Solr has to offer, and it will take you from first exposure to development and deployment in no time. Even if you wish to use Solr 5, you should find the information to be just as applicable due to Solr’s high regard for backwards compatibility. The book includes some useful information specific to Solr 5.
The first chapter starts with the basic concepts and a tutorial and review of Solr’s web admin screens for those new to Solr. The next seven chapters cover core information about using Solr, starting with the schema, then progressing to indexing and various ways to search or enhance your search. These chapters contain detailed reference information that you can flip back to as you employ Solr’s features, and they all start with “In a hurry?” tips to give you quick, focused help on the go.
Following these chapters, in the unique Integrating Solr chapter, you’ll learn how to use Solr with different languages and software before moving on and developing your understanding of performance and scaling including SolrCloud in the next chapter. Finally, the last chapter covers topics to help you put Solr into production, and as if all that wasn’t enough, we also provide a handy Solr parameter quick-reference sheet for you, which you can print and keep next to you on your desk.
-
Luke tool, Lucene toolbox
- Present
Developer and maintainer, keeping the toolset up to date with recent releases of Apache Lucene, Apache Solr and Elasticsearch
Release manager
Popularizer
-
PysaDroid
-
This is a hobby project that I built to make getting around the Helsinki metropolitan area easier.
It is essentially a search app with autocomplete, saved searches, and an easy-to-use graphical interface.
Java, Android, JSON
-
7Things
-
You often come across useful information on the Internet. But how can you keep it?
You may keep what you want to remember in all kinds of online and offline note-taking software, bookmarks, and other tools for a while. But very soon they become overloaded with tons of stuff, which makes it quite hard to find anything there.
We suggest saving everything directly to your memory. Simply select some text, share it with our application, and it will make sure you remember it.
-
MTEngine
-
MTEngine is an easy-to-use online tool for collaborative creation, improvement, and evaluation of machine translation systems. It uses natural language processing, statistical modelling, and linguistic rules to increase translation precision.
The project currently supports the Russian->English direction; new language pairs are in the pipeline.
-
Kaggle StackExchange text prediction contest
-
Ran machine learning experiments and found features that improved the quality of prediction.
Honors and Awards
-
Speaker at the Future of Product Development 2018 (Helsinki)
Contribyte Oy
How to apply Machine Learning to Product Development.
Cases from fintech (AlphaSense) and edtech (SmallStep.ai)
https://tulevaisuudentuotekehitys.com/schedule/
-
Quorian
-
Among the top 10 most viewed writers on Machine Translation and Computational Linguistics on quora.com
https://www.quora.com/Machine-Translation/writers?__snids__=1370377175&__nsrc__=2
https://www.quora.com/Computational-Linguistics/writers?__snids__=1370372825&__nsrc__=2
-
Winner at Stump the Chump 2013, first place (Lucene Revolution)
Lucene Revolution
Asked a winning question on Apache Solr technology
http://lucidworks.com/blog/stump-the-chump-dublin-winners/
(scored first)
-
Winner at Solr Usability Contest 2013
-
Made three winning Solr usability improvement suggestions:
1. Make atomic updates really atomic
2. Make the dashboard more interactive and configurable
3. Add scripting capability
http://solrstart.uservoice.com/forums/216001-usability-contest
Contest announcement: http://blog.outerthoughts.com/2013/08/wrap-up-of-solr-usability-contest/
My blog post about the contest:
http://dmitrykan.blogspot.fi/2013/09/solr-usability-contest-make-apache-solr.html
-
Winner at Stump the Chump 2013 (Lucene Revolution)
Lucene Revolution
Asked a winning question on Apache Solr technology
http://searchhub.org/2013/05/06/stump-the-chump-winners/
Languages
-
Russian
Native or bilingual proficiency
-
English
Full professional proficiency
-
Finnish
Limited working proficiency
-
Chinese
Elementary proficiency
Recommendations received
14 people have recommended Dmitry