{"id":602,"date":"2016-03-22T02:37:29","date_gmt":"2016-03-22T02:37:29","guid":{"rendered":"http:\/\/course.oeru.org\/sia\/?page_id=602"},"modified":"2016-03-22T02:37:29","modified_gmt":"2016-03-22T02:37:29","slug":"knowledge-discovery","status":"publish","type":"page","link":"https:\/\/course.oeru.org\/sia\/course-content\/knowledge-discovery\/","title":{"rendered":"Knowledge Discovery"},"content":{"rendered":"<div id=\"content\" class=\"mw-body container\" role=\"main\">\n<div class=\"row\">\n<div class=\"col-md-12\">\n<div class=\"panel\">\n<div class=\"panel-body\">\n<div id=\"bodyContent\">\n<div id=\"mw-content-text\" lang=\"en\" dir=\"ltr\" class=\"mw-content-ltr\">\n<h2><span class=\"mw-headline\" id=\"Introduction\">Introduction<\/span><\/h2>\n<p>This topic explores how we can extract useful information and actionable insights from sport data.\n<\/p>\n<p>A variety of labels have been used to characterise processes that extract useful information from data. These include &#8220;data mining, knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing&#8221;<sup id=\"cite_ref-1\" class=\"reference\"><a href=\"#cite_note-1\">[1]<\/a><\/sup>.\n<\/p>\n<p>Gregory Piatetsky-Shapiro <sup id=\"cite_ref-2\" class=\"reference\"><a href=\"#cite_note-2\">[2]<\/a><\/sup> introduced the term &#8220;knowledge discovery&#8221; in a report of a workshop in 1989 that brought together practitioners from &#8220;expert systems, machine learning, intelligent databases, knowledge acquisition, case-based reasoning and statistics&#8221;<sup id=\"cite_ref-3\" class=\"reference\"><a href=\"#cite_note-3\">[3]<\/a><\/sup>. 
The report of the workshop concluded &#8220;knowledge discovery in databases is an idea whose time has come&#8221;<sup id=\"cite_ref-4\" class=\"reference\"><a href=\"#cite_note-4\">[4]<\/a><\/sup>.\n<\/p>\n<p>William Frawley, Gregory Piatetsky-Shapiro, and Christopher Matheus <sup id=\"cite_ref-5\" class=\"reference\"><a href=\"#cite_note-5\">[5]<\/a><\/sup> provided one of the earliest overviews of knowledge discovery in databases in 1992. They defined knowledge discovery in databases (KDD) as:\n<\/p>\n<blockquote><p>\nKnowledge discovery is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. Given a set of facts (data) F, a language L, and some measure of certainty C, we define a pattern as a statement S in L that describes relationships among a subset Fs of F with a certainty c, such that S is simpler (in some sense) than the enumeration of all facts in Fs. A pattern that is interesting (according to a user-imposed interest measure) and certain enough (again according to the user\u2019s criteria) is called knowledge. The output of a program that monitors the set of facts in a database and produces patterns in this sense is discovered knowledge.<sup id=\"cite_ref-6\" class=\"reference\"><a href=\"#cite_note-6\">[6]<\/a><\/sup><\/p><\/blockquote>\n<p>They added &#8220;Patterns are interesting when they are novel, useful, and non-trivial to compute&#8221;<sup id=\"cite_ref-7\" class=\"reference\"><a href=\"#cite_note-7\">[7]<\/a><\/sup>.\n<\/p>\n<p>In 1996, Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth provided &#8220;an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases&#8221;<sup id=\"cite_ref-8\" class=\"reference\"><a href=\"#cite_note-8\">[8]<\/a><\/sup>. Their paper distinguishes KDD from data mining. 
They note:\n<\/p>\n<blockquote>\n<p>In our view, KDD refers to the overall process of discovering useful knowledge from data, and data mining refers to a particular step in this process. Data mining is the application of specific algorithms for extracting patterns from data<sup id=\"cite_ref-9\" class=\"reference\"><a href=\"#cite_note-9\">[9]<\/a><\/sup>.\n<\/p>\n<\/blockquote>\n<p>They argue that KDD is a process and data mining is a step within that process. The derivation of useful knowledge from data requires:\n<\/p>\n<ul>\n<li> data preparation\n<\/li>\n<li> data selection\n<\/li>\n<li> data cleaning\n<\/li>\n<li> incorporation of appropriate prior knowledge\n<\/li>\n<li> proper interpretation of the results of data mining<sup id=\"cite_ref-10\" class=\"reference\"><a href=\"#cite_note-10\">[10]<\/a><\/sup>\n<\/li>\n<\/ul>\n<p>Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth provide the conceptual and practical foundation for the KDD process in sport contexts. They propose:\n<\/p>\n<blockquote><p>\nKDD focuses on the overall process of knowledge discovery from data, including how the data are stored and accessed, how algorithms can be scaled to massive data sets and still run efficiently, how results can be interpreted and visualized, and how the overall man-machine interaction can usefully be modeled and supported<sup id=\"cite_ref-11\" class=\"reference\"><a href=\"#cite_note-11\">[11]<\/a><\/sup>.<\/p><\/blockquote>\n<p>Twenty years after the publication of their paper there is still a tendency to regard data mining and KDD as interchangeable terms. 
During this unit we have used the term analytics as a shorthand for KDD.\n<\/p>\n<p>Our discussion of analytics used this <a rel=\"nofollow\" class=\"external text\" href=\"https:\/\/sites.google.com\/site\/ucsportinformaticsandanalytics\/analytics\">definition<\/a>:\n<\/p>\n<blockquote><p>\nThe discovery, communication, and implementation of actionable insights derived from structured information in order to improve the quality of decisions and performance in an organization.<\/p><\/blockquote>\n<p>As we develop our KDD skills this activity will include unstructured data too. Whatever is included, it will be part of a process that the literature of the 1990s foresaw.\n<\/p>\n<h2><span class=\"mw-headline\" id=\"References\">References<\/span><\/h2>\n<ol class=\"references\">\n<li id=\"cite_note-1\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-1\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Fayyad, Usama; Piatetsky-Shapiro, Gregory; Smyth, Padhraic (1996). <a rel=\"nofollow\" class=\"external text\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">&#8220;From Data Mining to Knowledge Discovery in Databases&#8221;<\/a>. <i>AI Magazine<\/i> <b>17<\/b> (3):  39<span class=\"printonly\">. <a rel=\"nofollow\" class=\"external free\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-2\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-2\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Piatetsky-Shapiro, Gregory (1990). 
<a rel=\"nofollow\" class=\"external text\" href=\"https:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/873\/791\">&#8220;Knowledge Discovery in Real Databases: A Report on the IJCAI-89 Workshop&#8221;<\/a>. <i>AI Magazine<\/i> <b>11<\/b> (5):  68-70<span class=\"printonly\">. <a rel=\"nofollow\" class=\"external free\" href=\"https:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/873\/791\">https:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/873\/791<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-3\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-3\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Piatetsky-Shapiro, Gregory (1990). <a rel=\"nofollow\" class=\"external text\" href=\"https:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/873\/791\">&#8220;Knowledge Discovery in Real Databases: A Report on the IJCAI-89 Workshop&#8221;<\/a>. <i>AI Magazine<\/i> <b>11<\/b> (5):  68<span class=\"printonly\">. <a rel=\"nofollow\" class=\"external free\" href=\"https:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/873\/791\">https:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/873\/791<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-4\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-4\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Piatetsky-Shapiro, Gregory (1990). <a rel=\"nofollow\" class=\"external text\" href=\"https:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/873\/791\">&#8220;Knowledge Discovery in Real Databases: A Report on the IJCAI-89 Workshop&#8221;<\/a>. <i>AI Magazine<\/i> <b>11<\/b> (5):  70<span class=\"printonly\">. 
<a rel=\"nofollow\" class=\"external free\" href=\"https:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/873\/791\">https:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/873\/791<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-5\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-5\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Frawley, William; Piatetsky-Shapiro, Gregory; Matheus, Christopher (1992). <a rel=\"nofollow\" class=\"external text\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/viewFile\/1011\/929\">&#8220;Knowledge Discovery in Databases: An Overview&#8221;<\/a>. <i>AI Magazine<\/i> <b>13<\/b> (3):  57-70<span class=\"printonly\">. <a rel=\"nofollow\" class=\"external free\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/viewFile\/1011\/929\">http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/viewFile\/1011\/929<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-6\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-6\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Frawley, William; Piatetsky-Shapiro, Gregory; Matheus, Christopher (1992). <a rel=\"nofollow\" class=\"external text\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/viewFile\/1011\/929\">&#8220;Knowledge Discovery in Databases: An Overview&#8221;<\/a>. <i>AI Magazine<\/i> <b>13<\/b> (3):  58<span class=\"printonly\">. 
<a rel=\"nofollow\" class=\"external free\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/viewFile\/1011\/929\">http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/viewFile\/1011\/929<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-7\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-7\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Frawley, William; Piatetsky-Shapiro, Gregory; Matheus, Christopher (1992). <a rel=\"nofollow\" class=\"external text\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/viewFile\/1011\/929\">&#8220;Knowledge Discovery in Databases: An Overview&#8221;<\/a>. <i>AI Magazine<\/i> <b>13<\/b> (3):  58<span class=\"printonly\">. <a rel=\"nofollow\" class=\"external free\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/viewFile\/1011\/929\">http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/viewFile\/1011\/929<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-8\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-8\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Fayyad, Usama; Piatetsky-Shapiro, Gregory; Smyth, Padhraic (1996). <a rel=\"nofollow\" class=\"external text\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">&#8220;From Data Mining to Knowledge Discovery in Databases&#8221;<\/a>. <i>AI Magazine<\/i> <b>17<\/b> (3):  37-54<span class=\"printonly\">. 
<a rel=\"nofollow\" class=\"external free\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-9\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-9\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Fayyad, Usama; Piatetsky-Shapiro, Gregory; Smyth, Padhraic (1996). <a rel=\"nofollow\" class=\"external text\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">&#8220;From Data Mining to Knowledge Discovery in Databases&#8221;<\/a>. <i>AI Magazine<\/i> <b>17<\/b> (3):  39<span class=\"printonly\">. <a rel=\"nofollow\" class=\"external free\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-10\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-10\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Fayyad, Usama; Piatetsky-Shapiro, Gregory; Smyth, Padhraic (1996). <a rel=\"nofollow\" class=\"external text\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">&#8220;From Data Mining to Knowledge Discovery in Databases&#8221;<\/a>. <i>AI Magazine<\/i> <b>17<\/b> (3):  39<span class=\"printonly\">. 
<a rel=\"nofollow\" class=\"external free\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/<\/a><\/span>.<\/span><\/span>\n<\/li>\n<li id=\"cite_note-11\"><span class=\"mw-cite-backlink\"><a href=\"#cite_ref-11\">\u2191<\/a><\/span> <span class=\"reference-text\"><span class=\"citation Journal\">Fayyad, Usama; Piatetsky-Shapiro, Gregory; Smyth, Padhraic (1996). <a rel=\"nofollow\" class=\"external text\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">&#8220;From Data Mining to Knowledge Discovery in Databases&#8221;<\/a>. <i>AI Magazine<\/i> <b>17<\/b> (3):  39ff<span class=\"printonly\">. <a rel=\"nofollow\" class=\"external free\" href=\"http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/\">http:\/\/www.aaai.org\/ojs\/index.php\/aimagazine\/article\/download\/1230\/1131\/<\/a><\/span>.<\/span><\/span>\n<\/li>\n<\/ol>\n<p>\n<\/div>\n<div class=\"visualClear\"><\/div>\n<\/p><\/div>\n<\/p><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"row\">\n<div class=\"col-md-12\">\n<ul class=\"pager\">\n<li class=\"previous\">\n            <a href=\"\/sia\/course-content\/communities-of-practice\/thriving-communities\">\u2190 Previous<\/a>\n          <\/li>\n<li class=\"next\">\n            <a href=\"\/sia\/course-content\/knowledge-discovery\/introduction\">Next 
\u2192<\/a>\n          <\/li>\n<\/ul><\/div>\n<\/p><\/div>\n<\/div>\n<footer>\n<br \/>\n<\/footer>\n","protected":false},"excerpt":{"rendered":"<p>Introduction This topic explores how we can extract useful information and actionable insights from sport data. A variety of labels have been used to characterise processes that extract useful information from data. These include &#8220;data mining, knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing&#8221;[1]. Gregory Piatetsky-Shapiro [2] introduced the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":502,"menu_order":5800,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-602","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/course.oeru.org\/sia\/wp-json\/wp\/v2\/pages\/602","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/course.oeru.org\/sia\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/course.oeru.org\/sia\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/course.oeru.org\/sia\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/course.oeru.org\/sia\/wp-json\/wp\/v2\/comments?post=602"}],"version-history":[{"count":1,"href":"https:\/\/course.oeru.org\/sia\/wp-json\/wp\/v2\/pages\/602\/revisions"}],"predecessor-version":[{"id":603,"href":"https:\/\/course.oeru.org\/sia\/wp-json\/wp\/v2\/pages\/602\/revisions\/603"}],"up":[{"embeddable":true,"href":"https:\/\/course.oeru.org\/sia\/wp-json\/wp\/v2\/pages\/502"}],"wp:attachment":[{"href":"https:\/\/course.oeru.org\/sia\/wp-json\/wp\/v2\/media?parent=602"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}