Pacling 2013: http://pacling.nak.ics.keio.ac.jp/

My Presentation:
Paper: https://docs.google.com/file/d/0B2CohiuaB08MZlhHMkFqN0VSZ2s/edit?usp=sharing
Slide: https://docs.google.com/file/d/0B2CohiuaB08MU1VxSWxDQ0FEaEk/edit?usp=sharing

Comments and Questions:
* What is the definition of "common sense knowledge"?
  -> We define predicates (verbs, adjectives, verbal nouns) as common sense knowledge.
* What corpus did you use?
  -> We used the Japanese Google 7-gram data.
* How did you evaluate the results?
  -> We manually evaluated whether the assigned predicates were correct.
* What applications do you want to use the common sense knowledge base for?
  -> We want to use common sense knowledge in various NLP applications, such as a conversation system.
* You can try using the case frame corpus constructed by Kyoto University:
  http://nlp.ist.i.kyoto-u.ac.jp/index.php?%E4%BA%AC%E9%83%BD%E5%A4%A7%E5%AD%A6%E6%A0%BC%E3%83%95%E3%83%AC%E3%83%BC%E3%83%A0

2013.09.02
Interest presentation

Two Issues in Syntactic Parsing
memo
- Coordinate Structure: This information helps improve parsing accuracy
  -> DP matching method for alignment (path-based method)
  -> Parse trees produced by the grammar rules (tree-based method)
  * A tree can represent coordinate structure directly
  * Sum of all the scores of COORD/COORD nodes in the tree
- Grammatical Units
  * Multiword Expressions (MWE)
    - Lexicalized phrases & Institutional phrases: collocations, named entities
    ? How to construct MWE Lexicon -> from Wiktionary
    ? How to construct MWE annotated corpus -> annotation of Penn Treebank
    ? What to do
      -> Dictionary of semi-fixed and syntactically flexible MWEs
      -> MWE annotated corpus construction
      -> Parsing with MWE dictionary
  * Complex sentence pattern -> Joint processing
    Investigation of clause pattern variations around "SBAR" pattern
    Extraction of SBAR patterns in auto-parsed English corpus and grouping them
    Corpus data: Hiragana Times (http://www.hiraganatimes.com/en/)

Coordinate Structure: Did you try a statistical approach?
-> Yes, but we couldn't get good results.

Thematic Representation of Short Text Messages with Latent Topics: Application in the Twitter context
URL: http://mohamedmorchid.fr/articles/pacling2013_mohamed_morchid.pdf
memo
What is the merit of your output?
Example: input and output
  Input: just a tweet
  Output: added information related to the tweet, to help understand the tweet
How do you hack the Wikipedia article? -> time up
e.g. In 1954, the NBA had no health benefits, no pension plan, no minimum salary, and the average player's salary was $8,000 a season

Bursty Topics in Time Series Japanese / Chinese News Streams and their Cross-Lingual Alignment
memo
I want to see a recall evaluation. -> future plans
We think that the birthday topic is an important event between Japan and China.

2013.09.03
Interest Presentation

Using Heterogeneous Features for Scientific Citation Classification
memo
Researchers are faced with ever-increasing literature in all fields
-> Help researchers distill knowledge from scientific citation networks more efficiently

Design of a Web-scale Japanese Corpus
memo
NICT: Japanese Syntactic Dependency Database Version 1.1.
- 480 million syntactic dependency relations in 600 million pages and 43 billion sentences
Kyoto University: Kyoto-U Case Frames (Version 1.0) in 2009
Tsukuba-U: Tsukuba Web Corpus
NDL: Web Archive Project
JpTenTen11
Yata: Japanese Web Corpus 2010
Heritrix Crawler (Version 3.1)
- Developed by Internet Archive (United States)
- Used by national libraries (e.g. NDL in Japan)
NWC (Nihongo Web Corpus) Toolkit
Masuoka-Takubo POS target Kokugo-ken Short Unit & Kokugo-ken Long Unit (Chunker CRF++)
I hope this corpus is published
One issue is the copyright of the originals

2013.09.04
Interest Presentation

Extraction of Drug Information using Clue Words from Japanese Blogs
memo
Extraction of medical information from patients' blogs
* To get answers for these questions
* To help decision making
TOBYO-jiten
Okusuri110ban (NGO website with drug information)
--- 12,170 illness-related nouns automatically retrieved.
Does the recall of the results increase if you collect more examples?
-> We think precision is more important than recall (because we want to use this system for participants).
   So small, correct data is more important than big, noisy data.
Their approach accepts negative expressions.

Dependency-Based Method for Extracting Causes of Emotions
memo
Challenge: Recognition of implicit emotions from text
Novel method for extraction of emotion causes from sentences
How can I use the technology in my everyday life?
-> analysis of blogs, news and forums (for example, marketing emotions might be useful)
In other languages? -> a similar system is developed
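
Side note on the precision vs. recall answer in the drug information Q&A above: a minimal sketch of how the two measures differ, to remind myself why a small clean output can be preferred over a big noisy one. The drug names and the function name precision_recall are made up for illustration; nothing here comes from the paper or its data.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted items against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold list and two hypothetical system outputs.
gold = {"drug_a", "drug_b", "drug_c", "drug_d"}
small_clean = {"drug_a", "drug_b"}
big_noisy = {"drug_a", "drug_b", "drug_c", "noise_1", "noise_2", "noise_3"}

print(precision_recall(small_clean, gold))  # (1.0, 0.5)
print(precision_recall(big_noisy, gold))    # (0.5, 0.75)
```

In this toy case the big noisy output finds more of the gold items (higher recall), but half of what it shows is wrong, which is the situation the presenters said they want to avoid.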