felixchenfy/Data-Augment-and-Train-Yolo

Stars
16
Rank 1,311,288 (Top 26 %)
Language
Jupyter Notebook
Created over 5 years ago
Updated over 5 years ago

felixchenfy

Paste masked objects onto background images at random positions to generate training images, then train YOLOv3.
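
The paste-and-label idea described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function names `paste_masked_object` and `random_paste` are hypothetical, plain nested lists stand in for image arrays, and the label uses the common YOLO convention of a normalized (x_center, y_center, width, height) box, which is assumed here.

```python
import random


def paste_masked_object(background, obj, mask, top, left):
    """Copy obj pixels where mask is truthy onto a copy of background.

    Images are lists of rows. Returns (new_image, box), where box is
    (x_center, y_center, width, height) normalized by the background size.
    The caller must ensure the object fits at (top, left).
    """
    bg_h, bg_w = len(background), len(background[0])
    out = [row[:] for row in background]  # leave the input untouched
    rows, cols = [], []
    for r, mask_row in enumerate(mask):
        for c, keep in enumerate(mask_row):
            if keep:
                out[top + r][left + c] = obj[r][c]
                rows.append(top + r)
                cols.append(left + c)
    # Tight bounding box around the pasted (masked) pixels only.
    x_min, x_max = min(cols), max(cols) + 1
    y_min, y_max = min(rows), max(rows) + 1
    box = (
        (x_min + x_max) / 2 / bg_w,
        (y_min + y_max) / 2 / bg_h,
        (x_max - x_min) / bg_w,
        (y_max - y_min) / bg_h,
    )
    return out, box


def random_paste(background, obj, mask, rng=random):
    """Pick a random position where the object fits, then paste it."""
    bg_h, bg_w = len(background), len(background[0])
    top = rng.randint(0, bg_h - len(obj))
    left = rng.randint(0, bg_w - len(obj[0]))
    return paste_masked_object(background, obj, mask, top, left)
```

Calling `random_paste` repeatedly with different backgrounds yields image and box pairs; each box can be written as one label line per object, prefixed with the class index.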

Realtime-Action-Recognition

Apply ML to the skeletons from OpenPose; 9 actions; multiple people. (WARNING: I'm sorry that this is only good for course demo, not for real world applications !!! Those are very difficult !!!)

Monocular-Visual-Odometry

A simple monocular visual odometry (part of vSLAM) by ORB keypoints with initialization, tracking, local map and bundle adjustment. (WARNING: Hi, I'm sorry that this project is tuned for course demo, not for real world applications !!!)

open3d_ros_pointcloud_conversion

2 Python API functions for point cloud conversion between Open3D and ROS. Compatible with XYZ and XYZRGB point types.

ros_yolo_as_template_matching

Run 3 scripts to (1) Synthesize images (by putting a few template images onto backgrounds), (2) Train YOLOv3, and (3) Detect objects for: one image, images, video, webcam, or ROS topic.

3D-Scanner-by-Baxter

Use a robot arm (Baxter) mounted with a depth camera to scan an object's 3D model.

practice_motion_planning

Coding: ① Path planning: RRT*, A*; ② Tracking: Optimization, PurePursuit, FollowLine; ③ Planning and control on a mobile manipulator.

ros_openpose_rgbd

Visualize 3D human skeletons (body + hands) in ROS rviz. The 2D joints are detected by OpenPose; the depth comes from the depth image.
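
Lifting a 2D joint to 3D with a depth image is typically done by pinhole back-projection. The sketch below shows that standard formula only; it is an assumption that the repository works this way, and the names `backproject` and `joints_to_3d` as well as the intrinsics tuple `(fx, fy, cx, cy)` are hypothetical.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at the given depth.

    Returns the camera-frame point (x, y, z) in the same unit as depth.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def joints_to_3d(joints_2d, depth_image, intrinsics):
    """Convert (u, v) pixel joints to 3D points; None where depth is missing."""
    fx, fy, cx, cy = intrinsics
    points = []
    for u, v in joints_2d:
        # Depth images are indexed [row][column], i.e. [v][u].
        z = depth_image[int(round(v))][int(round(u))]
        # A depth of 0 is treated as an invalid reading (assumed convention).
        points.append(backproject(u, v, z, fx, fy, cx, cy) if z > 0 else None)
    return points
```

Joints that come back as `None` would be skipped when drawing the skeleton.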

ros_3d_pointing_detection

Which object is a person pointing at? Detect it using YOLO, OpenPose, and a depth image (in a customized scene).

Speech-Commands-Classification-by-LSTM-PyTorch

Classification of 11 types of audio clips using MFCC features and an LSTM. Pretrained on the Speech Commands Dataset with intensive data augmentation.

Jupyter Notebook

ros_detect_planes_from_depth_img

A Python node that detects planes in a depth image using the RANSAC algorithm. Input and output go through ROS topics.
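
For readers unfamiliar with RANSAC plane fitting, a minimal single-plane sketch follows. It is not the node's implementation: the names `plane_from_points` and `ransac_plane`, the threshold, and the iteration count are illustrative assumptions, and it takes a list of 3D points instead of a depth image.

```python
import random


def plane_from_points(p, q, r):
    """Return (normal, d) with normal . x + d = 0, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # The cross product of two in-plane vectors gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Fit one plane: sample 3 points, count inliers, keep the best model."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue  # degenerate sample, try again
        n, d = model
        inliers = [
            i for i, (x, y, z) in enumerate(points)
            if abs(n[0] * x + n[1] * y + n[2] * z + d) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

To find several planes, one common approach is to remove the inliers of the best plane and run the fit again on the remaining points.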

Detect-Object-and-6D-Pose

(1) 3D scan object by Baxter. (2) Label objects automatically by depth camera and (3) train Yolo. (4) [TODO; NOT DONE YET!!!] Finally, detect object and fit 3D model to know the 6D pose.

Mask-Objects-from-RGBD

Put objects on a plane. Use depth camera to find them and add label (for training Yolo).

API_for_Simulating_Multi-Link_System

Mathematica API for simulating the dynamics and collision of planar multi-link objects (by Euler-Lagrange equation).

Detect-Hand-Grasping-Object

A toy project: Detect my hand grasping object in the video. Backbone algorithms: SiamMask, Mask_RCNN, OpenPose

Jupyter Notebook

Data-Storage

Store some images, gifs, etc.

ros_pub_and_sub_rgbd_and_cloud

Python nodes to publish/subscribe RGB-D images and their point clouds (or any of them) to/from ROS topics.

record_images_from_usbcam

Run one script and press 's'/'d' to save your laptop's camera images to disk. Two versions: (1) Python, and (2) ROS node.

cpp_practice_image_processing

Implement: Sobel; Canny; Harris; Hough line; Fit line; RANSAC.

Command_Robot_to_Move

Use voice to tell robot the target, then the robot detects it and moves there. (LSTM, YOLO, Plane detection, Motion planning, ...)

Jupyter Notebook

ros_turtlebot_control

ROS services for driving a Turtlebot3 to a target pose using the `Move to Pose` algorithm.

ros_speech_commands_classification

(1) Press key to record audio; (2) Speak a word to microphone; (3) Finally, see the classification result on GUI and ROS topic.

Voice_Control_Turtlebot--Masters_Final

A toy project of using voice to tell a Turtlebot Robot to detect and move to target, achieved by 4 components (1) speech classification, (2) object detection, (3) plane detection, and (4) control of wheel motion.

DQN_SwingUpPendulum

Using a Deep Q-Network to train an AI to play the swing-up pendulum game.

ros_record_rgbd_images

Press a key to record color/depth images from ROS topics or a RealSense camera: `a` saves a single image, `s` starts continuous recording, `d` stops recording, and `q` quits.

ros_images_publisher

A python script to publish color or depth images from a folder to ROS topic.

Monocular-Visual-Odometry-Data

Only my VO project's test data and results

Baxter_Picks_Up_Dices

A readme for the CV part of ME495's final project “Baxter Robot picking up dices”. In short, (1) detecting dice using the graph cut algorithm, and (2) locating their positions by geometry.

keyboard_input

4 functions for reading keyboard input: read a char or a string, with or without a timeout.