A Language Modeling Approach to Predicting Reading Difficulty (paper notes)

阿新 • Published: 2018-12-21


Volume: Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004
Authors: Kevyn Collins-Thompson | James P Callan
Year: 2004
Venues: NAACL | HLT

Data (not publicly available):
550 English documents, 12 grade levels, 448,715 tokens, 17,928 types, drawn from a variety of topics

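1 Introduction
Formula-based methods ~ linear regression models
Our statistical model ~
1) It captures finer-grained detail about each word, so accuracy stays high on shorter passages, even those under 10 words
2) A statistical method yields a probability distribution over difficulty levels, not just a single prediction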

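2 Description of Web Corpus
A token is any single occurrence of a word
A type is a distinct word string, counted once no matter how many times it occurs
Data: 550 English documents, 12 grade levels, 448,715 tokens, 17,928 types, drawn from a variety of topics
Our hypothesis: even when the topics of the texts differ, word usage patterns are clearly related to text difficulty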
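
To make the token/type distinction concrete, here is a minimal sketch. The tokenizer (lowercasing, then matching runs of letters and apostrophes) is an assumption made for illustration; the notes do not say how the paper tokenized its corpus.

```python
import re
from collections import Counter


def count_tokens_and_types(text):
    """Return (number of tokens, number of types) for a text."""
    # Assumed tokenizer: lowercase, then take runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Each distinct string is one type, however often it occurs.
    types = Counter(tokens)
    return len(tokens), len(types)


sample = "The cat sat on the mat. The mat was red."
n_tokens, n_types = count_tokens_and_types(sample)
print(n_tokens, n_types)
```

Here "the" and "mat" each occur more than once, so the number of types is smaller than the number of tokens.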

3 Related Work
Earlier readability measures rely on two main factors:
1) the familiarity of semantic units (words or phrases)
2) the complexity of syntax
The most commonly used are "vocabulary-based measures":
These use a word list, rather than the number of syllables in a word, to estimate semantic difficulty. The following measures all estimate it from some kind of word list:
the Lexile measure (Stenner et al., 1988)
the Revised Dale-Chall formula (Chall and Dale, 1995)
the Fry Short Passage measure (Fry, 1990)
--Lexile (version 1.0) uses the Carroll-Davies-Richman corpus of 86,741 types (Carroll et al., 1971);
--Dale-Chall uses the Dale 3000 word list;
--Fry's Short Passage Measure uses Dale & O'Rourke's 'The Living Word Vocabulary' of 43,000 types (Dale and O'Rourke, 1981).
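
The idea behind a word-list measure can be sketched as the share of tokens that do not appear on a list of familiar words. This is only an illustration: `FAMILIAR_WORDS` is a tiny hypothetical stand-in for a real list such as the Dale 3000, the tokenizer is assumed, and the published formulas combine this kind of statistic with other terms not shown here.

```python
import re

# Hypothetical stand-in for a real familiar-word list (e.g. the Dale 3000).
FAMILIAR_WORDS = {"the", "cat", "sat", "on", "a", "mat", "was", "red"}


def unfamiliar_fraction(text, familiar=FAMILIAR_WORDS):
    """Fraction of tokens that are not on the familiar-word list."""
    # Assumed tokenizer: lowercase, then take runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    unfamiliar = sum(1 for token in tokens if token not in familiar)
    return unfamiliar / len(tokens)


print(unfamiliar_fraction("The cat sat on a photosynthesis mat"))
```

A higher fraction suggests harder vocabulary, but the result is only as good as the word list behind it.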

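Compared with Si and Callan (2001), the earliest and only previous language modeling approach:
2001: covered a single topic (science) with 3 difficulty levels and used a Bayes classifier. It included no analysis of feature selection, so it is unclear whether the classifier conflated topic prediction with difficulty prediction
Ours: no topic restriction, 12 difficulty levels, and a larger training set. It also uses a Bayes classifier, but the classes are not treated as independent: we use a mixture of grade-level models, which greatly improves accuracy. Sentence length is not used as a syntactic component. Feature selection and the model's ability to generalize were both tested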
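
As a rough sketch of how a unigram language model per grade can yield a probability distribution over difficulty levels, the example below trains one word-count model per grade and scores a passage with Bayes' rule. Several choices are assumptions made for illustration and are not the paper's method: add-one smoothing, a uniform prior over grades, independently trained grade models (no mixture across neighbouring grades), the tokenizer, and the toy data in `TOY_CORPUS`.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, then take runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def train_unigram_models(docs_by_grade):
    """Build one word-count table per grade plus the shared vocabulary."""
    counts = {
        grade: Counter(tok for doc in docs for tok in tokenize(doc))
        for grade, docs in docs_by_grade.items()
    }
    vocab = set().union(*counts.values())
    return counts, vocab


def grade_posterior(text, counts, vocab):
    """Return P(grade | text) under a uniform prior over grades."""
    # Words never seen in training are ignored (an assumption).
    tokens = [tok for tok in tokenize(text) if tok in vocab]
    log_scores = {}
    for grade, grade_counts in counts.items():
        total = sum(grade_counts.values())
        # Add-one smoothing keeps unseen words from zeroing the score.
        log_scores[grade] = sum(
            math.log((grade_counts[tok] + 1) / (total + len(vocab)))
            for tok in tokens
        )
    # Normalise in log space to avoid underflow on long passages.
    top = max(log_scores.values())
    norm = sum(math.exp(score - top) for score in log_scores.values())
    return {g: math.exp(s - top) / norm for g, s in log_scores.items()}


# Hypothetical toy data, for illustration only.
TOY_CORPUS = {
    1: ["the cat sat on the mat", "the dog ran"],
    8: ["the experiment measured chemical reactions",
        "scientists analyzed the hypothesis"],
}
counts, vocab = train_unigram_models(TOY_CORPUS)
posterior = grade_posterior("the dog sat", counts, vocab)
print(posterior)
```

The output is a distribution over grades rather than a single label, so a caller can report the most likely grade or inspect how confident the model is.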

4 The Smoothed Unigram Model
