Talk:Numerical methods of learning from precedents (practice, V.V. Strijov)

From MachineLearning.

The main page will be gradually translated into English due to a significant number of requests. 9.2.2014
{{TOCright}}

== Introduction ==
''Machine Learning and Data Analysis'' is a practical course that focuses on methods for scientific research. The course teaches students how to conduct research projects in the field of machine learning and data analysis. The '''abstract goal''' is to learn to convey ideas in a precise, clear, and elegant way; the '''specific goal''' is to write a research paper that is accepted by other researchers in the field of machine learning and data analysis, and to present a report.
The '''expected result''' is a research paper submitted to a peer-reviewed journal from the list compiled by the [https://en.wikipedia.org/wiki/Higher_Attestation_Commission Higher Attestation Commission].

The course introduces students to technologies used in scientific research and teaches them to present the results of their studies in the format accepted by other researchers in the field of machine learning and data analysis. By the end of the term, each student is expected to write a research paper and submit it to a peer-reviewed journal from the list compiled by the Higher Attestation Commission. During the course, students learn the basics of scientific writing and of designing computational experiments, using the associated tools: the typesetting system LaTeX, the bibliographic system BibTeX, and the computing environment MATLAB.

Work on a project includes exploring the literature, writing a mathematical problem statement and an algorithm description, investigating the algorithm's properties, and running computational experiments. Each student selects an individual problem from the list of suggested research topics. The student analyzes recent publications on the selected topic, formulates the problem, and presents it to the group. The student then describes and analyzes the suggested methods mathematically and gives an intermediate report. The last step is to run computational experiments that illustrate the method's properties on real or synthetic data. Each paper undergoes a revision process with the student's peers acting as reviewers. The work is synchronized via [[SourceForge|SourceForge.org]], in the project ''[[MLAlgorithms]]''.

'''Course format'''. Each project is aided by an assistant and an expert.
A '''student''' wants to learn to state research problems formally, find adequate references, and generate novel and significant ideas for solving problems.

An '''assistant''' helps the student with technical issues, advises the student on machine learning topics, responds promptly to problems as they arise, and performs evaluation and grading. Each assistant is expected to have sufficient publishing experience. Ideally, the assistant is writing a paper on a related topic. It is recommended to organize the weekly review so that the student enters the corrections personally.

An '''expert''' guarantees the novelty and importance of the paper, suggests problems, and provides data.

== Course-related materials ==
* Brief description of the course: goals, structure, and grading policy [http://svn.code.sf.net/p/mlalgorithms/code/MLEducation/Strijov2014MLCourseShort_eng.pdf?format=raw CourseShort.pdf]
* [http://www.ccas.ru/jmlda/bib_refs Bibliography of all completed projects (191 projects as of December 2014)]
* Slides in PDF with a course overview (goals, syllabus, summary of 2009-2014 results) [http://svn.code.sf.net/p/mlalgorithms/code/MLEducation/Strijov2015MLCourseSlides.pdf?format=raw CourseSlides.pdf]
* [[Численные методы обучения по прецедентам (практика, В.В. Стрижов)/Basic schedule | Basic schedule]] with the list of tasks to complete
* Report on the course results, Fall 2013 [http://svn.code.sf.net/p/mlalgorithms/code/MLEducation/Strijov2013Fall-MLDA.pdf?format=raw Report2013Fall.pdf]
* Report presentation templates in [http://sourceforge.net/p/mlalgorithms/code/HEAD/tree/Group074/Kuznetsov2013SSAForecasting/doc/Ivkin2013PresentationSample.pdf?format=raw pdf], [http://sourceforge.net/p/mlalgorithms/code/HEAD/tree/Group074/Kuznetsov2013SSAForecasting/doc/Ivkin2013PresentationSample.tex?format=raw tex]
* Lists of recommended journals on machine learning and data analysis:
** High impact factor [http://svn.code.sf.net/p/mvr/code/lectures/4thYear/Lecture1/2_High_IF_ScientificJournals High_IF_ScientificJournals.pdf]
** Low impact factor [http://svn.code.sf.net/p/mvr/code/lectures/4thYear/Lecture1/3_Low_IF_ScientificJournals Low_IF_ScientificJournals.pdf]
* On reviewing, resubmitting, and correcting the paper:
** Examples of feedback from reviewers: [http://svn.code.sf.net/p/mvr/code/lectures/4thYear/Lecture1/r1_Reviewer_PCA_NN Review1.pdf], [http://svn.code.sf.net/p/mvr/code/lectures/4thYear/Lecture1/r2_Reviewer_Belsley Review2.pdf], [http://svn.code.sf.net/p/mvr/code/lectures/4thYear/Lecture1/r3_Review_Ranking Review3.pdf]
** Sample responses: [http://svn.code.sf.net/p/mvr/code/lectures/4thYear/Lecture1/r4_Reviewer_Gait Response1.pdf], [http://svn.code.sf.net/p/mvr/code/lectures/4thYear/Lecture1/r5_Reviewer_Report_on_the_manuscript_correction Response2.pdf]
** Correction sample: [http://svn.code.sf.net/p/mvr/code/lectures/4thYear/Lecture1/r6_CorrectionSample CorrectedPaper.pdf]

== Past terms ==
{|class="wikitable"
|-
! Link to the course page
! Description
|-
| [[Численные методы обучения по прецедентам (практика, В.В. Стрижов)/Группа 274, весна 2015 | Group 274, spring 2015 (in Russian)]]
| My first publication in a Higher Attestation Commission journal. The course involves experts and personal assistants.
|-
| [[Численные методы обучения по прецедентам (практика, В.В. Стрижов)/Группа YАД, весна 2015 | Group YАД, spring 2015 (in Russian)]]
| My first publication in a Higher Attestation Commission journal. The course involves experts and personal assistants.
|-
| [[Численные методы обучения по прецедентам (практика, В.В. Стрижов)/Группа 174, весна 2015 | Group 174, spring 2015 (in Russian)]]
| Research planning.
|-
| [[Численные методы обучения по прецедентам (практика, В.В. Стрижов)/Группа 174, осень 2014 | Group 174, fall 2014 (in Russian)]]
| Conducting commercially oriented research and developing applications. The problems are chosen from industrial and academic sources.
|-
| [[Численные методы обучения по прецедентам (практика, В.В. Стрижов)/Группа 974, осень 2014 | Group 974, fall 2014 (in Russian)]]
| Lectures on emerging machine learning issues. Essays and practice in Mathematica.
|-
| [[Численные методы обучения по прецедентам (практика, В.В. Стрижов)/Группа 174, весна 2014 | Group 174, spring 2014 (in Russian)]]
| My first publication in a Higher Attestation Commission journal. The course involves experts and personal assistants.
|-
| [[Численные методы обучения по прецедентам (практика, В.В. Стрижов)/Группа 074, весна 2014 | Group 074, spring 2014 (in Russian)]]
| Writing essays: brief problem statements and analysis.
|-
| [[Численные методы обучения по прецедентам (практика, В.В. Стрижов)/Группа 974, весна 2014 | Group 974, spring 2014 (in Russian)]]
| The "Software engineering" course, professor L. Karpov.
|-
|}

== Requirements ==
'''Basic'''
* Students must have previously passed courses in analysis, discrete mathematics, probability theory, statistical inference, and optimization algorithms.

'''Advanced'''
* Students are encouraged to become familiar with the materials of [http://shad.yandex.ru/lectures/machine_learning.xml the lecture course on machine learning by K. Vorontsov].

== Approximate syllabus ==
# Find and describe the data. Compose a reference list and store it in a bib-file. Write an abstract for the paper.
# Visualize the data. Write a literature review.
# Write an introduction to the paper. The introduction should include a review of existing methods and a description of the proposed approach.
# Write a problem statement. Emphasize the novelty of the suggested approach. Come up with a draft solution.
# Design the computational experiment and obtain initial results.
# Describe the suggested approach in detail.
# Complete the computational experiments.
# Describe the results of the computational experiments. This includes error analysis and comparison with other methods.
# Correct the paper according to the reviewers' comments.
# Correct the theoretical content.
# Correct the paper's structure.
# Submit the manuscript of the paper to a journal.
# Give a report.

<!--
== Report ==
The report consists of the following materials:
# a research paper,
# the source code of the algorithm,
# a review of the work,
# a talk and presentation.
!-->
== Consulting and grading ==
# The project is divided into separate tasks, each accompanied by a list of requirements that serve as the quality criteria for grading.
# Each task must be completed within the week and submitted by the day before the lecture.
# Preferably, each task is improved and resubmitted several times before the deadline.

Each completed task (marked with a corresponding letter) yields 1 point, and the suffix + or - adds or subtracts 0.25 points.
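
The point rule above can be illustrated with a minimal Python sketch. The mark strings (such as "A+" or "C-"), the function names, and the convention that an empty mark means a task that was not completed are assumptions made for illustration; the course page does not define a mark format.

```python
def task_points(mark: str) -> float:
    """Return the points for one task mark such as "A", "B+" or "C-"."""
    mark = mark.strip()
    if not mark:
        # Assumption: an empty mark means the task was not completed.
        return 0.0
    points = 1.0  # each completed task yields 1 point
    if mark.endswith("+"):
        points += 0.25  # the "+" suffix adds 0.25
    elif mark.endswith("-"):
        points -= 0.25  # the "-" suffix subtracts 0.25
    return points


def total_score(marks: list[str]) -> float:
    """Sum the points over all task marks of one student."""
    return sum(task_points(m) for m in marks)


# Hypothetical marks for four tasks: 1.25 + 1.0 + 0.75 + 0.0
print(total_score(["A+", "B", "C-", ""]))  # prints 3.0
```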

== Homeworks ==
''Note for assistants''. The tasks listed below provide the quality criteria for homework grading.
=== Homework1: synchronization tools ===
# Obtain a technical computing environment ([http://www.mathworks.com/ MATLAB] or [http://www.gnu.org/software/octave/ Octave]).
# Install the typesetting system TeX ([http://miktex.org MiKTeX] for Windows, [http://www.tug.org/texlive/ TeX Live] for Linux and Mac OS).
# Install a text editor, for example [http://www.texniccenter.org/ TeXnicCenter] or [http://www.winedt.com/ WinEdt] for Windows, and [http://www.tug.org/texworks/ TeXworks] for Linux.
# Install the bibliographic reference manager [http://jabref.sourceforge.net/ JabRef].
# Create an account at the [http://sourceforge.net/ SourceForge] repository and e-mail the login to the group's coordinator. Read the [introductory materials] on version control systems.
# Install a Subversion client ([http://tortoisesvn.net/downloads.html TortoiseSVN] for Windows, [http://rabbitvcs.org/ RabbitVCS] for Linux).
# Following the [[SourceForge| guidelines]], check out the [https://sourceforge.net/projects/mlalgorithms/ MLAlgorithms] [https://svn.code.sf.net/p/mlalgorithms/code repository].
# Create an account at [http://www.machinelearning.ru/wiki/index.php?title=%D0%A1%D0%BB%D1%83%D0%B6%D0%B5%D0%B1%D0%BD%D0%B0%D1%8F:Userlogin&type=signup&returnto=%D0%97%D0%B0%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B0%D1%8F_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0 MachineLearning.ru] and e-mail the login to the group's coordinator.

Run the installed tools and get acquainted with their interfaces.
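
As a quick check after installation, the following Python sketch reports which command-line tools are visible on the system path and builds the Subversion checkout command for the repository linked above. The executable names in <code>TOOLS</code> and the destination folder name are assumptions and may differ on a given system; the command is only printed, not run.

```python
import shutil

# Assumption: typical executable names; they may differ per platform.
TOOLS = ["svn", "pdflatex", "bibtex", "octave"]

# Repository URL as given in the checkout step of this homework.
REPO_URL = "https://svn.code.sf.net/p/mlalgorithms/code"


def find_tools(names):
    """Map each tool name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


def checkout_command(dest="mlalgorithms"):
    """Build (but do not run) the Subversion checkout command."""
    # "dest" is a hypothetical local folder name chosen for illustration.
    return ["svn", "checkout", REPO_URL, dest]


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        print(f"{name}: {path or 'not found'}")
    print(" ".join(checkout_command()))
```

Graphical clients such as TortoiseSVN perform the same checkout through their own dialogs, so the printed command is only needed when working from a terminal.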

=== Homework1: LaTeX ===
# If necessary, read the [https://en.wikipedia.org/wiki/LaTeX LaTeX] and [http://en.wikipedia.org/wiki/Bibtex BibTeX] articles.
# Download the [[Media:jmlda-guides.zip|article template, ZIP]] and compile it.
* [http://liinwww.ira.uka.de/csbib?strijov%20nonlinear Bibliographic base example]
* [http://liinwww.ira.uka.de/cgi-bin/bibshow?e=Njtd0ECMQ03121/fyqboefe%7d81352582&r=bibtex&mode=intra Bibliographic reference example]
* [http://en.wikipedia.org/wiki/List_of_academic_databases_and_search_engines List of bibliographic bases]
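
A bib-file is a plain-text sequence of entries, each of the form <code>@type{key, field = {value}, ...}</code>. The minimal Python sketch below formats one such entry. The function name is hypothetical, and every field value is a placeholder, not a real reference.

```python
def bibtex_entry(key, entry_type, fields):
    """Format one BibTeX entry; fields maps field names to values."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        # Braces around the value keep its capitalization as typed.
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder data for illustration only; not a real publication.
entry = bibtex_entry(
    "Placeholder2015Example",
    "article",
    {
        "author": "Surname, Name",
        "title": "Placeholder title",
        "journal": "Placeholder Journal",
        "year": "2015",
    },
)
print(entry)
```

In practice a reference manager such as JabRef writes these entries, so the sketch is mainly useful for reading an existing bib-file.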

=== Homework1: MATLAB ===
# Read the [introductory materials] on MATLAB.
# Read the documenting conventions in the [[Media: MatlabStyle1p5.pdf|Matlab Programming Style Guidelines]].

Revision as of 13:06, 10 June 2015
