No Black Box Machine Learning Course – Learn Without Libraries

freeCodeCamp.org · 135 minutes read

The course teaches machine learning in JavaScript without libraries. It starts from software engineering, building a drawing app and tools for data collection, then moves on to processing data, extracting and visualizing features, and implementing a K nearest neighbors classifier, with emphasis on data scaling and on evaluating classifier accuracy.

Insights

  • Course teaches machine learning in JavaScript without libraries, focusing on software development skills initially and transitioning to machine learning concepts.
  • Data collection, processing, and visualization are emphasized, with features like drawing app creation, feature extraction, and classifier implementation covered.
  • Implements a K nearest neighbors classifier and computes decision boundaries to show how the classifier labels the feature space.
  • Ends with a review lesson in Python that uses libraries to reimplement K Nearest Neighbors (KNN) more quickly and assess its accuracy.

Recent questions

  • What does the course teach?

    Machine learning in JavaScript without libraries.

  • Who is the instructor?

    Radu, a Computer Science PhD and university lecturer.

  • What is emphasized in data collection?

    Importance of processing and visualizing collected data.

  • What is the goal of the course?

    Enhancing software development skills.

  • How is data saved locally?

    Using a function named "save" to save files.

Summary

00:00

JavaScript Machine Learning Course by Radu

  • Course teaches machine learning in JavaScript without libraries, enhancing software development skills.
  • Taught by Radu, a Computer Science PhD and university lecturer.
  • Covers building a drawing app, working with data, feature extraction, and classifiers.
  • Focuses on software engineering initially, transitioning to machine learning.
  • Teaches data collection tool creation using a drawing app for desktop and mobile.
  • Instructs on processing and visualizing collected data and emphasizes why both steps matter.
  • Covers feature extraction and visualization, building a chart component from scratch.
  • Introduces nearest neighbor classifier implementation and data scaling importance.
  • Implements advanced K nearest neighbors classifier and decision boundary computation.
  • Offers a review lesson in Python, using libraries for quicker completion.

19:16

Drawing utility for touch devices with undo.

  • The function "path" allows specifying the context and path to draw.
  • A draw utility object named "draw" is created in a separate file called draw.js.
  • The object "draw" includes a path function with parameters for context, path, and a default color of black.
  • The path function sets the stroke style to the given color (black by default) and draws a line with a width of three.
  • Each point array is spread into its x and y components when calling moveTo.
  • Multiple paths are drawn by iterating through all paths and drawing each with specified properties.
  • To optimize for mobile devices, a viewport meta tag is added to adjust the width and prevent zooming.
  • Event listeners for touch events are added to ensure drawing functionality on touch devices.
  • An undo button is implemented to remove the last drawn path.
  • Data collection functionality is added to record the student's name, session ID, and drawings of various objects.
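The drawing steps above can be sketched as follows. This is a minimal sketch, not the course's actual draw.js: it assumes each path is an array of [x, y] points and that ctx is a 2D canvas rendering context; the paths and undo helpers are hypothetical names added for illustration.

```javascript
// Hypothetical sketch of a draw utility like the one described for draw.js.
const draw = {};

// Draw one path (an array of [x, y] points) as a connected line.
draw.path = (ctx, path, color = "black") => {
  if (path.length === 0) return; // nothing to draw
  ctx.strokeStyle = color;
  ctx.lineWidth = 3;
  ctx.beginPath();
  ctx.moveTo(...path[0]); // spread [x, y] into two arguments
  for (let i = 1; i < path.length; i++) {
    ctx.lineTo(...path[i]);
  }
  ctx.stroke();
};

// Draw several paths by iterating over them (assumed helper name).
draw.paths = (ctx, paths, color = "black") => {
  for (const path of paths) {
    draw.path(ctx, path, color);
  }
};

// Undo: drop the most recently drawn path (does nothing when empty).
const undo = (paths) => {
  paths.pop();
  return paths;
};
```

After calling undo, the canvas would need to be cleared and the remaining paths redrawn with draw.paths.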

41:21

"Save Locally Drawn Paths with Timestamped File"

  • The "thank you" button is changed to a "save" button that saves the drawn paths locally.
  • The new function "save" will allow saving files on the local computer.
  • Instructions will guide on downloading and placing the file in the dataset.
  • To create and download the file, an anchor (a) element with an href attribute will be used.
  • Data will be converted to a JSON string for saving.
  • JSON is a standard format for storing data, readable and supported by many languages.
  • The file will be named uniquely based on a generated timestamp.
  • The download action will be triggered for the file.
  • The event listener for mouse up will be added to the document for better user experience.
  • A spreadsheet is available for testing the system on different devices and reporting issues.
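The save steps above might look roughly like this. It is a sketch under assumptions: the helper name buildDownload, the data URI prefix, and the .json extension are illustrative choices, not confirmed details of the course code. The DOM part only runs in a browser.

```javascript
// Build the file name and link target for a download (pure, testable part).
// The timestamp parameter defaults to the current time so names are unique.
function buildDownload(data, timestamp = Date.now()) {
  const json = JSON.stringify(data); // convert the data to a JSON string
  return {
    filename: timestamp + ".json",
    href: "data:text/plain;charset=utf-8," + encodeURIComponent(json),
  };
}

// Trigger the download through a temporary anchor element (browser only).
function save(data) {
  const { filename, href } = buildDownload(data);
  const a = document.createElement("a");
  a.setAttribute("href", href);
  a.setAttribute("download", filename);
  a.style.display = "none";
  document.body.appendChild(a);
  a.click(); // starts the download
  document.body.removeChild(a);
}
```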

01:02:22

Fixing module recognition, constants, progress indicator, viewer app.

  • In draw.js, the browser doesn't recognize module, so the export is guarded by checking that typeof module is not undefined.
  • The same structure is applied to constants, which are moved to a new file called constants.js and required in the main file.
  • A progress indicator is implemented in a new file called utils.js as a function called printProgress with count and max parameters.
  • Exporting the utils object and including it in the data set generation to display a progress indicator.
  • Storing data in a format readable by browsers by creating a samples JavaScript file in a JS objects directory.
  • Creating a viewer app in viewer.html with basic HTML structure, including meta tags, title, external style sheet, and JavaScript files.
  • Grouping samples by student ID using a group by function in utils.js and displaying them in a table format in viewer.html.
  • Creating a function called create row in display.js to display rows with student names and their corresponding samples.
  • Implementing a blur effect for flagged users' drawings by adding a blur class in CSS and applying it to images of flagged users in display.js.
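A utils.js along these lines would cover the export guard, the progress indicator, and the grouping helper described above. This is a hedged sketch: formatPercent and the exact output format are assumptions, and the sample objects with a student_id key in the usage note are hypothetical.

```javascript
const utils = {};

// Format a ratio such as 0.25 as "25.00%" (assumed helper).
utils.formatPercent = (n) => (n * 100).toFixed(2) + "%";

// Print progress on one terminal line; returns the text for convenience.
utils.printProgress = (count, max) => {
  const text = count + "/" + max + " (" + utils.formatPercent(count / max) + ")";
  if (typeof process !== "undefined" && process.stdout && process.stdout.write) {
    process.stdout.write("\r" + text); // "\r" returns to the start of the line
  }
  return text;
};

// Group objects by the value stored under the given key.
utils.groupBy = (objArray, key) => {
  const groups = {};
  for (const obj of objArray) {
    const val = obj[key];
    if (groups[val] == null) {
      groups[val] = [];
    }
    groups[val].push(obj);
  }
  return groups;
};

// Export only when running under Node; browsers have no `module`.
if (typeof module !== "undefined") {
  module.exports = utils;
}
```

For example, utils.groupBy(samples, "student_id") would return an object whose keys are student IDs and whose values are arrays of that student's samples.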

01:25:48

"Drawing Comparison: Traditional vs. Quickdraw Data"

  • The speaker admires various drawings, including a Christmas tree and horses, noting the uniqueness of each.
  • Mention of detailed drawings like Jin-Zon-Ning and Zhong-Nanago, highlighting the time-consuming nature of their creation.
  • Comparison to Quickdraw data, noting the absence of time limits and the presence of an undo button for improved drawing quality.
  • Encouragement for viewers to share their page styling versions for potential showcase in a future video.
  • Introduction of functions in a file named features.js to extract features from samples, starting with a function to count paths.
  • Explanation of a function to count points from paths by flattening arrays and exporting for node usage.
  • Creation of a feature extractor script in a node directory, reading samples, looping through them, and extracting features like path and point counts.
  • Outputting feature data to a features file, with a mention of excluding extra information if necessary.
  • Update of a web application to display feature data using Google charts, with details on setting up options, loading packages, and generating data tables.
  • Implementation of color styling based on labels, with a mention of transparency challenges and a switch to a custom chart library for better control over styling and features.
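The two feature functions mentioned above can be sketched as below. The sample shape ({ label, paths }) and the extractFeatures helper are assumptions made for illustration; the course's feature extractor also reads and writes files, which is omitted here.

```javascript
const features = {};

// Number of separate strokes in a drawing.
features.getPathCount = (paths) => paths.length;

// Total number of points: flatten the paths one level, then count.
features.getPointCount = (paths) => paths.flat().length;

// Hypothetical extractor: turn each sample into a labeled feature point.
function extractFeatures(samples) {
  return samples.map((sample) => ({
    label: sample.label,
    point: [
      features.getPathCount(sample.paths),
      features.getPointCount(sample.paths),
    ],
  }));
}
```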

01:50:23

"Dynamic Chart Features with Color Emphasis"

  • Images for different objects are generated based on specified colors using graphics in utils.
  • Colors of shapes are determined by colors specified in utils.
  • Chart supports a callback function for item selection, implemented as handleClick in display.js.
  • The handleClick function in display.js adds an emphasize class to the selected item.
  • The emphasize class in style.css sets the background color to yellow for emphasized items.
  • After emphasizing, display.js scrolls the element into view with auto behavior.
  • Before a new item is emphasized, the emphasize class is removed from every previously emphasized item.
  • Chart and content are separated with chart fixed on the side in style CSS.
  • Control panel added to toggle visibility of input container for sketchpad.
  • Dynamic point added to chart to show location of drawn features, moving as drawing progresses.

02:14:35

Enhancing Point Visibility in Data Analysis

  • The goal is to make a specific point more visible on a faded background when refreshing the page.
  • The value of the point needs to be large for visibility when zooming in on a chart.
  • To ensure the point remains visible even when zoomed in and at extreme corners, adjustments are needed.
  • Toggling input visibility should hide the point above the dynamic input section.
  • Implementing a triggerUpdate method in sketchpad.js helps update the point's visibility.
  • Calculating the width and height of drawings is crucial for structuring data.
  • Adjusting feature functions and the feature extractor to include width and height calculations.
  • Ensuring the dataset features are regenerated to reflect the new width and height values.
  • Addressing outliers in data, like drawings exceeding canvas boundaries, is common and requires attention.
  • Classifying drawings based on extracted features and finding the nearest sample for classification are key steps in the process.
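As a rough illustration of the last steps above, width and height can be computed from the extreme coordinates of all points, and classification can start from the single nearest sample. The function names and the choice to return 0 for an empty drawing are assumptions, not the course's confirmed code.

```javascript
// Width of a drawing: horizontal span across all points of all paths.
function getWidth(paths) {
  const xs = paths.flat().map((p) => p[0]);
  if (xs.length === 0) return 0; // assumption: empty drawing has no extent
  return Math.max(...xs) - Math.min(...xs);
}

// Height of a drawing: vertical span across all points of all paths.
function getHeight(paths) {
  const ys = paths.flat().map((p) => p[1]);
  if (ys.length === 0) return 0;
  return Math.max(...ys) - Math.min(...ys);
}

// Euclidean distance between two feature points of equal length.
function distance(a, b) {
  return Math.hypot(...a.map((v, i) => v - b[i]));
}

// Index of the point closest to `point`, or -1 if `points` is empty.
function getNearestIndex(point, points) {
  let bestIndex = -1;
  let bestDist = Infinity;
  points.forEach((p, i) => {
    const d = distance(point, p);
    if (d < bestDist) {
      bestDist = d;
      bestIndex = i;
    }
  });
  return bestIndex;
}
```

The label of the sample at the returned index would then be used as the predicted label.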

02:38:29

Rectify chart distortion with proper data display.

  • The chart looks misleading because it forces a square aspect ratio, which distorts the data.
  • To rectify this, calculate the delta on x and y to determine the maximum value for proper chart display.
  • Adjust the chart to eliminate squishing or stretching, ensuring accurate data representation.
  • Feature functions like path count and point count impact data classification differently.
  • Normalize data to ensure equal significance of all features for fair classification.
  • Data scaling techniques like normalization remap values between zero and one for uniform importance.
  • Outlier points can disrupt normalization, prompting consideration of standardization for better results.
  • Implement the k nearest neighbors classifier to determine classification based on majority votes.
  • Modify the nearest neighbor search to return multiple neighbors for improved classification accuracy.
  • Count and identify the majority label among the nearest samples for accurate classification.

03:02:35

"Optimizing K-nearest neighbor classifier accuracy"

  • The process involves passing a sample point through a for loop and saving the file before refreshing the page to draw and analyze lines.
  • Despite the proximity of other objects, the system selects clocks due to their higher quantity.
  • Variants of nearest neighbor classifiers, like distance-weighted versions, exist, and implementing them is encouraged.
  • To evaluate the classifier, data is split into training and testing sets, with correct classifications counted to compute accuracy.
  • The training set comprises 50% of the samples.
  • The training and testing sets are written into files, ensuring no overlap between the two for accurate testing.
  • Normalization is crucial, with data normalized using training set values to avoid discrepancies.
  • The classification process involves classifying all unknown points at once, with labels determined based on the training data.
  • The accuracy of the K-nearest neighbor classifier is calculated, with different values of K tested for optimal results.
  • Decision boundaries are introduced as a more informative method than visual representations like the spider, aiding in understanding the classifier's decisions.
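The evaluation steps above can be sketched like this. Splitting by position (first half for training) is an assumption made for illustration; the summary only states the 50% ratio and that the sets must not overlap. The classify argument stands for any function that maps a feature point to a label.

```javascript
// Split samples into non-overlapping training and testing sets.
function splitSamples(samples, trainingRatio = 0.5) {
  const trainingAmount = Math.floor(samples.length * trainingRatio);
  return {
    training: samples.slice(0, trainingAmount),
    testing: samples.slice(trainingAmount),
  };
}

// Accuracy: fraction of testing samples whose predicted label is correct.
function computeAccuracy(testing, classify) {
  if (testing.length === 0) return 0; // assumption: avoid dividing by zero
  let correct = 0;
  for (const sample of testing) {
    if (classify(sample.point) === sample.label) {
      correct++;
    }
  }
  return correct / testing.length;
}
```

When features are normalized, the minimum and maximum should come from the training set only and then be reused for the testing points, as the summary notes. Running computeAccuracy for several values of K is one simple way to compare them.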

03:29:01

Implementing KNN Classifier for Image Classification

  • KNN classifier is included in the code for classification purposes.
  • The classifier is instantiated using training samples and a specific value for K.
  • The classify function is removed from the sketchpad, streamlining the code.
  • Evaluation is run after printing accuracy, followed by generating a decision boundary plot.
  • A canvas of 100x100 is created using the create canvas function.
  • The plot is pixel-based, with each pixel representing a feature between 0 and 1.
  • Normalized points are created for each pixel to color based on predicted values.
  • The decision boundary plot is saved as a PNG image in the data set directory.
  • The image is set as the background of the chart in the viewer HTML file.
  • JavaScript is used to prepare data for Python by converting samples to CSV format.
  • The feature extractor JavaScript file is modified to output training and testing data in CSV form.
  • Python implementation of K Nearest Neighbors (KNN) is done using libraries.
  • Training and testing data are read from CSV files and parsed into usable formats.
  • KNeighborsClassifier is instantiated with 50 neighbors, the brute-force algorithm, and uniform weights.
  • The accuracy of the model is calculated and printed in the terminal.
  • A function to read feature files is created for reusability.
  • Suggestions are made to explore further with matplotlib for displaying feature values and decision boundaries.
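The decision boundary and CSV steps above can be sketched without a canvas library. In this sketch, decisionGrid returns a grid of predicted labels rather than a PNG, and the vertical flip (row 0 corresponding to feature values near 1) is an assumption about chart orientation. The function names and the CSV column layout are also hypothetical.

```javascript
// Predict a label for every pixel of a size x size grid.
// Each pixel maps to a normalized feature point in [0, 1).
function decisionGrid(classify, size = 100) {
  const grid = [];
  for (let y = 0; y < size; y++) {
    const row = [];
    for (let x = 0; x < size; x++) {
      // Flip y so the top row holds the largest values of the second feature.
      const point = [x / size, 1 - y / size];
      row.push(classify(point));
    }
    grid.push(row);
  }
  return grid;
}

// Convert samples to CSV text: a header line, then one line per sample.
function toCSV(headers, samples) {
  const lines = [headers.join(",")];
  for (const sample of samples) {
    lines.push([...sample.point, sample.label].join(","));
  }
  return lines.join("\n");
}
```

Coloring each pixel by its label and saving the result as an image would produce the background plot described above; the CSV text can be written to a file for the Python lesson to read.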