## What are some really interesting machine learning projects for beginners? (Quora)

• Applying the Viterbi algorithm to build a tool that splits Chinese strings into words. This is actually quite a hard problem. Unlike English, where spaces separate words, Chinese has no spaces, and what people consider a "word unit" varies because of issues like ambiguity, contractions, specialized terms, etc. It's probably not the most straightforward application of the algorithm, but it was interesting nevertheless. You can read over the example in Wikipedia for more information: Viterbi algorithm – Wikipedia

• Using CLIPS to build an expert system. The example we were shown in class was a program that could guess what animal you are thinking of, based on a series of questions it asks and your answers to them. An expert system typically tries to emulate how human experts make decisions, using a knowledge base (the basic facts and rules known to the system) and an inference engine (where the system deduces new facts). It was quite an interesting application. You can download CLIPS here: A Tool for Building Expert Systems, but I would suggest first going through the User's Guide and the Basic Programming Guide here: Documentation | CLIPS, to gain more intuition as to what it is.
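To make the word-splitting idea from the first project more concrete, here is a minimal sketch in Python. It is not the course project itself: it assumes a simple unigram word model and uses Viterbi-style dynamic programming to pick the most probable split, whereas a fuller tool might run the Viterbi algorithm over a hidden Markov model of character tags. The dictionary `WORD_PROBS`, the unknown-character penalty, and `MAX_WORD_LEN` are all made up for illustration.

```python
import math

# Hypothetical toy unigram probabilities (invented for this sketch).
# A real tool would estimate these from a large segmented corpus.
WORD_PROBS = {
    "我": 0.05,
    "喜欢": 0.02,
    "喜": 0.001,
    "欢": 0.001,
    "北京": 0.02,
    "北": 0.002,
    "京": 0.002,
    "大学": 0.02,
    "北京大学": 0.01,
    "大": 0.003,
    "学": 0.003,
}
UNKNOWN_LOG_PROB = math.log(1e-8)  # assumed penalty for an unseen single character
MAX_WORD_LEN = 4                   # assumed longest word we try to match


def segment(text):
    """Return the highest-probability split of `text` under the unigram model."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best log-probability for text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word in that split
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = text[start:end]
            if word in WORD_PROBS:
                score = math.log(WORD_PROBS[word])
            elif len(word) == 1:
                score = UNKNOWN_LOG_PROB  # fall back to a single unknown character
            else:
                continue  # unknown multi-character strings are not candidates
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start

    # Follow the back-pointers from the end of the string to recover the words.
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]


print(segment("我喜欢北京大学"))
```

With these toy numbers, the single dictionary entry "北京大学" beats the two-word split "北京" + "大学", which shows how the chosen probabilities, not the algorithm, decide what counts as a word unit.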

• Classification: Data is labelled, meaning it is assigned a class, for example spam/non-spam or fraud/non-fraud. The decision being modeled is to assign labels to new, unlabelled pieces of data. This can be thought of as a discrimination problem: modelling the differences or similarities between groups.

• Regression: Data is labelled with a real value (think floating point) rather than a class label. Examples that are easy to understand are time series data, like the price of a stock over time. The decision being modeled is what value to predict for new data whose value is not yet known.

• Clustering: Data is not labelled, but can be divided into groups based on similarity and other measures of natural structure in the data. An example would be organizing pictures by faces without names, where the human user has to assign names to the groups, like iPhoto on the Mac.

• Rule Extraction: Data is used as the basis for the extraction of propositional rules (antecedent/consequent, aka if-then). Such rules may be, but typically are not, directed, meaning that the methods discover statistically supportable relationships between attributes in the data, not necessarily involving something that is being predicted. An example is the discovery of the relationship between the purchase of beer and diapers (this is data mining folklore; true or not, it's illustrative of the desire and opportunity).
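As a rough illustration of the rule extraction idea, the sketch below scores simple if-then rules between pairs of items using two common measures: support (how often the items appear together) and confidence (how often the consequent appears given the antecedent). The baskets in `TRANSACTIONS`, the thresholds, and the function names are all invented for this example, and real tools use more efficient algorithms such as Apriori rather than checking every pair.

```python
from itertools import permutations

# Hypothetical shopping baskets, invented purely for illustration.
TRANSACTIONS = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent); assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def extract_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for single-item rules.

    Assumes min_support > 0, so a rule whose antecedent never occurs is
    filtered out before its confidence is computed.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, c in permutations(items, 2):
        s = support({a, c}, transactions)
        if s < min_support:
            continue
        conf = confidence({a}, {c}, transactions)
        if conf >= min_confidence:
            rules.append((a, c, s, conf))
    return rules


for a, c, s, conf in extract_rules(TRANSACTIONS):
    print(f"if {a} then {c}: support={s:.2f}, confidence={conf:.2f}")
```

Note that the rules come out in both directions (beer to diapers and diapers to beer), which matches the point above that such relationships are not necessarily tied to a single attribute being predicted.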

A lot of people have given some amazing answers. Prof. Andrej Karpathy has rightly talked about the importance of implementing the basics while learning and working through proofs instead of reading passively. In my own experience, solving small problems and then moving on to bigger ones is the right way to proceed in any domain. In programming, you first learn the basics of control statements and syntax before moving on to learning about databases and creating a full-stack web or mobile application.

Similarly, if you want to work on a really interesting and useful machine learning project, you must first build a strong base, learn about the different components and their place in the overall picture, identify a problem area and then begin developing a solution for it with your knowledge. In this way, you will not only solve a problem, but also learn a lot of new things in the process.

There are a variety of ways to achieve this, and the internet today is full of reading and video resources that can help you in your journey. I personally had to go through a lot of such resources and do a lot of my own research and projects before I could be useful to anyone in a professional capacity and worthy of being paid for doing my job. I made heavy use of MIT OCW courses, Kaggle competitions, Stanford and CMU's YouTube lectures, and personal research in my journey (I still go through them today).

A few months ago, I came across a great set of Nano-degrees from Udacity that seem to have been meticulously designed and organised. What I liked most about them was that they list all the prerequisites for an immersive experience rather than just rushing through the concepts. All the prerequisites can be completed through the free Udacity courses. This takes care of one of the major gaps in a number of other resources, which simply assume proficiency and do not tell you how to acquire it.

Deep Learning Nano-degree projects: generate TV scripts using recurrent neural networks; generate faces with generative adversarial networks; design a deep reinforcement learning agent to control several quadcopter flying tasks, including take-off, hover, and landing.

The fact that some of the pioneers, academicians and experts in this field, such as Sebastian Thrun, Ian Goodfellow, and Andrew Trask, are involved in the design of the courses makes these Nano-degrees worth it in my opinion. The mentor and peer support, along with a dedicated career cell, are additional advantages.