Documentation
SROSP Framework
Next generation framework for social robot development
- Version: 1.0
- Author: Mahta Akhyani
- Created: 15 September, 2023
- Updated: 20 September, 2023
If you have any questions that are beyond the scope of this help file, or any suggestions, please feel free to contact us via email.
Features
The overall system architecture consists of four main layers:
- ROS Layer
- Web Server Layer
- Android Layer
- Hardware Layer
The schematic of the system architecture is shown in the figure below.
Django Web Application
The Django web application is responsible for communicating between the user interface and the database or the API endpoints.
It also serves the static files (css, js, img, etc.) and the HTML templates.
The static files are where the user interface communicates directly with the ROS server through a library called rosbridge js.
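For intuition, here is a minimal sketch of the JSON exchange that rosbridge expects over its websocket, written in Python with the websockets package from the requirements below (the UI itself uses rosbridge js; the topic name and message type here are illustrative assumptions, not part of the project):

import asyncio
import json

import websockets  # the same package listed in the requirements below

async def publish_once():
    # rosbridge listens on ws://host:9090 by default
    async with websockets.connect("ws://localhost:9090") as ws:
        # advertise a topic, then publish one message (topic/type are illustrative)
        await ws.send(json.dumps({"op": "advertise", "topic": "/demo", "type": "std_msgs/String"}))
        await ws.send(json.dumps({"op": "publish", "topic": "/demo", "msg": {"data": "hello robot"}}))

asyncio.run(publish_once())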
Installation
If you are here from the locally deployed version of SROSP, please skip to the Deploy section for running the app. Follow the steps below to install the project:
1. Install Python 3.7 or higher and pip.
2. Install the required packages: django-rest-framework, rosbridge, and websockets.
3. Go to OSSRP/cleancoded/webInterface/interface_backend and run:
python manage.py makemigrations
python manage.py migrate
python manage.py runserver
Go to http://localhost:8000/index.html to access the robot control user interface.
You will be prompted to log in when you choose the robot panel.
If there is no robot panel, click on the "add" button, fill in the form that opens, and save it.
Rename your robot's GUI HTML file to the name you entered in the form, copy it to /OSSRP/cleancoded/webInterface/interface_backend/core/templates, and put the static files in /core/statics. Now refresh the page.
File Structure
interface_backend/ -> contains the main project
core/ -> contains the main app; handles the main user interface and between-device communication (a.k.a. the Android app)
soundHandler/ -> contains the soundHandler app; handles sound-related models
serialHandler/ -> contains the serialHandler app; handles serial-communication-related models
static/ -> contains the static files (css, js, images, etc.)
templates/ -> contains the HTML templates
db.sqlite3 -> the database file
manage.py -> the main file for running the server
requirements.txt -> contains the required packages for the project
Apps & API Endpoints
core/views.py -> contains the main API endpoints
soundHandler/views.py -> contains the API endpoints for the soundHandler app
serialHandler/views.py -> contains the API endpoints for the serialHandler app
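As a rough illustration of what these view modules contain, here is a minimal Django REST Framework endpoint sketch; the view class name and response payload are illustrative assumptions, not the project's actual code:

import socket

from rest_framework.response import Response
from rest_framework.views import APIView

class ReqIpView(APIView):
    # hypothetical sketch of an endpoint like /reqip
    def get(self, request):
        # look up the server's local IP address and return it as JSON
        ip = socket.gethostbyname(socket.gethostname())
        return Response({"ip": ip})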
Models
core/models.py contains the main models:
EmotionModel:
face -> name of the emotion
face_video_url -> URL of the video for the emotion
video_file -> file of the video for the emotion
sound -> a foreign key field to the Song model in the soundHandler app (the sound to play with the video simultaneously)
interface_button_emoji -> emoji for the emotion (used in the user interface)
dynatype -> type of the Dynamixel motor
movement -> True if the motor must move, False otherwise
dir -> direction of the movement (0 for up/right, 1 for down/left)
pos_up -> neck pitch turn
pos_right -> neck yaw turn
right_hand -> right hand final position
left_hand -> left hand final position
speed -> speed of the movement
theta -> angle of the movement
yaw -> yaw of the movement
soundHandler/models.py contains the models for the soundHandler app:
Song:
title -> title of the sound file
description -> description of the sound file
audio_file -> the sound file (optional)
audio_link -> the link to the sound file (optional)
duration -> the duration of the sound file
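For orientation, here is a minimal sketch of how the Song model described above might look as a Django model; the field names follow this documentation, while the field types and options are plausible assumptions, not the project's exact definitions:

from django.db import models

class Song(models.Model):
    # field names follow the documentation above; types are assumptions
    title = models.CharField(max_length=200)
    description = models.TextField(blank=True)
    audio_file = models.FileField(upload_to="media/", blank=True, null=True)  # optional
    audio_link = models.URLField(blank=True, null=True)  # optional
    duration = models.FloatField()

    def __str__(self):
        return self.title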
Serializers
All of the following are EmotionModel serializers:
EmotionModelSerializer
HooshangDynaSerializerHead
HooshangDynaSerializerHands
HooshangDynaSerializer
URLs
- Go to http://localhost:8000/index.html to access the robot control user interface
- Go to http://localhost:8000/admin to access the admin interface
- Go to http://localhost:8000/wizard/setup to access the setup wizard
- The following are the API endpoints:
/reqpub -> GET: loads the main user interface; POST: receives movement commands and sends them to the robot
/reqcli -> GET: passes the user-selected emotion's data to the robot; POST: receives the robot's response
/reqemo -> GET: sends back the video URL of the selected sound, or the sound URL of the selected face (video)
/reqip -> GET: sends back the IP address of the robot (fetches the server's local IP address)
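As a usage sketch, the endpoints can be exercised with plain HTTP from Python; the movement-command body below is an illustrative guess at the payload shape, not the actual schema:

import requests

BASE = "http://localhost:8000"

# read back the server's local IP address
print(requests.get(f"{BASE}/reqip").text)

# send a movement command to the robot (body fields are hypothetical)
resp = requests.post(f"{BASE}/reqpub", data={"speed": 50, "theta": 10})
print(resp.status_code)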
Deploy
You need a web server to run the project. I used nginx and gunicorn; you can use Apache or any other web server.
To use the built-in Django web server, run the following command: python manage.py runserver
However, in a production environment (e.g. Jetson Nano), the app should be served automatically by the web server (e.g. nginx) on each reboot, with the gunicorn service & socket running in the background.
If it is not, check the nginx status to make sure it is active, and then check for errors in the log files: /var/log/nginx/error.log
and /var/log/nginx/access.log
. If you are using a different web server, check its log file.
This could also be caused by a failed gunicorn service. Check the status of the gunicorn service with the following command:
sudo systemctl status gunicorn
and check the gunicorn configuration file /etc/systemd/system/gunicorn.service
and the gunicorn.socket configuration file /etc/systemd/system/gunicorn.socket.
NOTE: Make sure you have allowed the ports in use (e.g. 1935, 5353, 8080, etc.) in the firewall. To check the status of the firewall, run the following command:
sudo ufw status
To allow a port, run the following command: sudo ufw allow [port number]
To allow a range of ports, run the following command: sudo ufw allow [port number]:[port number]/tcp (ufw requires a protocol when allowing a port range)
Installation
1. Install nginx
2. Install gunicorn
3. Create a new user for the project
4. Create a new directory for the project
5. Clone the project into the new directory
6. Create a new virtual environment for the project
7. Install the requirements for the project
8. Create a new configuration file for the project in /etc/nginx/sites-available/
9. Create a new configuration file for the project in /etc/supervisor/conf.d/
10. Create a new configuration file for the gunicorn service in /etc/systemd/system/
11. Set up the gunicorn service
12. Set up nginx
13. Use the following commands for gunicorn:
sudo systemctl start gunicorn
sudo systemctl enable gunicorn
sudo systemctl status gunicorn
sudo systemctl stop gunicorn
sudo systemctl restart gunicorn
sudo systemctl daemon-reload
sudo systemctl reload gunicorn
sudo systemctl disable gunicorn
14. Use the following commands for nginx:
sudo systemctl daemon-reload
sudo systemctl restart nginx
sudo systemctl status nginx
sudo systemctl stop nginx
sudo systemctl start nginx
sudo systemctl reload nginx
sudo systemctl disable nginx
sudo systemctl enable nginx
Notes
*If you want to access the user interface from other devices on the same network, use the same URL but replace the IP address with the IP address of the main server (e.g. the Jetson). You can also use a domain name instead of the IP address.*
*Add the line include /etc/nginx/mime.types; to the nginx sites-enabled/[my website's name] configuration file, because your media files will be ignored by the browser if they are served with the default mimetype of text/plain.*
*If requirements.txt does not exist, run pipreqs [path/to/project] to generate a requirements.txt file for your project, then install the requirements with pip install -r requirements.txt*
*The user interface must be accessible from a browser at the following address: http://[Jetson|Computer's IP address]:[port number(default:5353)]/index.html*
Additions to the project
If you want to assign a static IP address to the Jetson Nano, you can use the following command:
sudo nmtui
To check the IP address of the Jetson Nano, run the following command:
hostname -I
To check the IP address of the computer (on Windows), run the following command:
ipconfig
If you want to deploy the project or assign a purchased domain name to it, you must change the server_name in the nginx configuration file from localhost to the domain name. Be sure to configure the DNS settings of the domain name to point to the IP address of the server, using Cloudflare or any other DNS service provider.
Daemons
To run the project from scratch, run the following command:
gunicorn --bind host:port --workers 3 --threads 2 --timeout 120 --log-level debug --log-file log.txt --access-logfile access_log.txt --error-logfile error_log.txt --capture-output --enable-stdio-inheritance --daemon --pid pid.txt --pythonpath [path/to/project] [project_name].wsgi:application
Database
type of the database: sqlite3
location of the database: [path/to/project]/db.sqlite3
To access the database, you need to be a superuser. To create a superuser, run the following command:
python manage.py createsuperuser
Now you can access the database at /admin with admin privileges (e.g. modify model objects, add new users, etc.).
Android
The Android device is currently only responsible for acting as the robot's face and sound player. In other words, it acts as a simple multimedia player that is also in constant communication with the ROS and Django servers.
However, given the modular structure of the whole system, it has been a vision for future developments to integrate the camera and microphone data input of the Android device as well.
In that case, ROSJava would need to be implemented in the Android app. It would be in charge of publishing the camera and microphone audio data on the respective topics (which already exist).
- Language: Kotlin
Download APK and Source Code
You can download the latest version of the app from here.
You can also access the source code of the app from here.
Supported Systems
*This app can be a little tricky on Android <= 4.4 because it does not support some of the newer streaming protocols; some media might fail to play.*
User Interface
You must first fill in the URL with the [IP address:port number] of the main server (e.g. the Jetson). There is no need for the /reqcli part.
*NOTE: the http:// prefix is required.*
Then press the Check button at the bottom. Make sure the status is OK before going into full-screen mode.
Note that if the status is OK but you see no URL provided, there is no problem; the client probably hasn't requested any changes yet. Move on.
When the screen goes into full-screen mode, there is a small lock icon in the top right corner of the screen. If you wish to lock the media player controls to prevent sudden exits or block accidental touches, tap the lock icon.
You can always release the controls by touching the lock icon again.
Multimedia
The app's internal media player is ExoPlayer.
If you want to stream a video or audio file from the server, you must first put the file in the media folder of the project and then use the following URL to access it: http://[IP address:port number]/media/[file name].[file extension]
Additionally, you can stream online video or audio by sending the URL of the stream server to the Android app, just as you would with a local file. (A local streaming server is recommended, though, for faster response times and lower bandwidth usage.)
The app can play any video or audio file supported by the Android device (e.g. mp4, mp3, wav, etc.) and any video or audio stream supported by the device (e.g. rtsp, rtmp, etc.).
It caches media the first time it is played, to save time and bandwidth for the rest of the session.
Web Server
When the start button is clicked for the first time, the Android app starts listening on [user-given url:port number]/reqcli (in this case: ip:5353/reqcli) for requests from the user interface.
It then waits for an update in the JSON it receives as a GET request and changes the media output accordingly.
It looks for any changes in the "audio_url" and "video_url" fields of the JSON and plays each medium through a separate channel at the same time, meaning that the user is not limited to playing the video's own audio and can pair any audio file with any video file.
After each action, the Android app sends a POST request whose body is the status of the action (e.g. "Error!", "Video played successfully", etc.). These are received by the user interface (back end => Django: views.py) and displayed to the user (front end => index.html).
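To make the exchange concrete, here is a rough Python simulation of the polling loop described above; the audio_url/video_url field names come from this section, while everything else (server address, polling interval, status string handling) is an illustrative assumption:

import time

import requests

BASE = "http://192.168.1.10:5353"  # hypothetical server address
last = {}

while True:
    # the Android app polls /reqcli for the latest emotion/media JSON
    data = requests.get(f"{BASE}/reqcli").json()
    if data.get("audio_url") != last.get("audio_url") or data.get("video_url") != last.get("video_url"):
        # a real client would hand these URLs to its media player here
        print("would now play:", data.get("video_url"), "+", data.get("audio_url"))
        # report the action's status back to the user interface
        requests.post(f"{BASE}/reqcli", data="Video played successfully")
        last = data
    time.sleep(1)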
ROS
Installation
Follow the steps below to install the project:
- Install ROS Melodic or higher
- Install the following packages:
numpy==1.25.1
matplotlib==3.7.2
scipy==1.7.3
pandas==2.0.3
tensorflow==2.6.0
torch
openpyxl==3.1.2
Pillow==10.0.0
pyaudio==0.2.13
opencv-python
tabulate==0.9.0
Image==1.5.33
mediapipe==0.10.2
deepface==0.0.79
glob2==0.7
pyax12
argparse
transformers
hazm
deep_translator
edge_tts
audio_common_msgs
- Clone the repository and go to OSSRP/cleancoded/
- Run the following commands:
catkin_make
source devel/setup.bash
- Now, to run the ROS nodes, run the following command:
roslaunch infrastructure init_robot.launch
This will initiate all the available nodes. You can also initiate other node combinations separately by running the following commands:
roslaunch infrastructure initiate_facial.launch
roslaunch infrastructure gaze_pose.launch
roslaunch infrastructure speech_emotion_analysis.launch
roslaunch infrastructure speech_to_text.launch
If a launch fails, check ~/.ros/log/latest/ for the error message.
If you would like to run a node separately, run the following command:
rosrun infrastructure [node_name]
Nodes
List of available ROS nodes:
opencv_client -> converts the Image messages to OpenCV messages and publishes them in the form of several lists of lists (List.msg), since ROS doesn't support 3D array messages
audio_recorder -> captures the mic's input via the PyAudio library and broadcasts it as Audio_common_msgs
speech_emotion_analysis_server -> uses pre-trained audio features for sentiment analysis to analyze the speech's emotion, then publishes the results as custom messages (EmoProbArr.msg)
audio_feature_extractor -> uses the PRAAT library to extract 10 features of the audio, then publishes the results as custom messages (AudFeatures.msg)
FaceEmotionAnalysis -> uses the DeepFace library to analyze the face emotions and their probabilities, then publishes the results as custom messages (FaceEmotions.msg)
landmark_detection -> uses Google's MediaPipe library to extract full-body landmarks, then publishes the results as custom messages (Landmarks.msg)
gaze_detector -> uses the face/head landmarks to estimate the head and gaze position, then returns the result as a single String message
gaze_pose_node -> calls the gaze_detector service and publishes its responses
speech_to_text_server -> takes in the audio data as Audio_common_msgs and returns the transcript as a single String
text_to_speech_server -> takes a text as a single String message and returns spoken data as Audio_common_msgs
speech_to_text_node -> calls the speech_to_text_server service and publishes its responses
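As a rough sketch of what one of these nodes looks like, here is a minimal audio_recorder-style publisher using rospy, PyAudio, and audio_common_msgs; the /captured_audio topic appears in the Topics section below, while the audio parameters (16 kHz mono, 1024-sample buffers) are illustrative assumptions:

import pyaudio
import rospy
from audio_common_msgs.msg import AudioData

rospy.init_node("audio_recorder")
pub = rospy.Publisher("/captured_audio", AudioData, queue_size=10)

# open the default microphone (16 kHz mono is an assumed configuration)
pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=16000,
                 input=True, frames_per_buffer=1024)

while not rospy.is_shutdown():
    # read one buffer of raw samples and broadcast it as AudioData
    chunk = stream.read(1024, exception_on_overflow=False)
    pub.publish(AudioData(data=chunk))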
List of developed ROS launch files:
init_robot.launch -> initiates all the available nodes
initiate_facial.launch -> initiates all the face/head-related nodes
gaze_pose.launch -> initiates the gaze_detector service and the gaze_pose_node
speech_emotion_analysis.launch -> initiates the audio_recorder and speech_emotion_analysis_server
speech_to_text.launch -> initiates the speech_to_text_server and speech_to_text_node
Messages
List of developed ROS msg files:
AudFeatures.msg -> 10 float64 items as 10 audio features
EmoProbArr.msg -> emotion's name as String and its probability as float32
Array3D.msg -> a list of float64s
List.msg -> a list of Array3Ds
Landmarks.msg -> 4 lists of geometry_msgs/Points as the face, right hand, left hand, and pose landmarks
FaceEmotions.msg -> a list of EmoProbArrs
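For illustration, here is how a node might fill a message shaped like Landmarks.msg from Python; the field names (face, right_hand, left_hand, pose) and the infrastructure.msg import path are assumptions inferred from the descriptions above, not the actual .msg definition:

from geometry_msgs.msg import Point
# hypothetical import path for the custom message package
from infrastructure.msg import Landmarks

msg = Landmarks()
# each field is assumed to be a geometry_msgs/Point[] list
msg.face = [Point(x=0.1, y=0.2, z=0.0)]
msg.right_hand = [Point(x=0.5, y=0.4, z=0.0)]
msg.left_hand = [Point(x=0.4, y=0.4, z=0.0)]
msg.pose = [Point(x=0.0, y=0.0, z=0.0)]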
Services
List of developed ROS services:
audio_features -> same as the audio_feature_extractor node
speech_emotion_analysis -> same as the speech_emotion_analysis_server node
speech_to_text -> same as the speech_to_text_server node
gaze_pose -> same as the gaze_detector node
text_to_speech -> same as the text_to_speech_server node
List of developed ROS srv files:
AudFeature.srv -> takes Audio_common_msgs and returns AudFeatures
EmoProb.srv -> takes Audio_common_msgs and returns EmoProbArr
Gaze.srv -> takes List.msg (as the camera frame) and Landmarks, and returns a String
Stt.srv -> takes Audio_common_msgs and returns a String
Tts.srv -> takes a String and returns Audio_common_msgs
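A minimal sketch of calling one of these services from Python follows; the infrastructure.srv import path and the exact request/response field layout are assumptions, since the actual .srv definitions are not reproduced here:

import rospy
from audio_common_msgs.msg import AudioData
# hypothetical import path for the custom service definition
from infrastructure.srv import Stt

rospy.init_node("stt_client")
rospy.wait_for_service("speech_to_text")
speech_to_text = rospy.ServiceProxy("speech_to_text", Stt)

# send captured audio and print the returned transcript
response = speech_to_text(AudioData(data=b""))  # placeholder audio bytes
print(response)  # assumed to carry the transcript as a String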
Topics
List of developed ROS topics:
Image Processing:
/image_raw -> raw image from the camera [Sensor_msgs/Image]
/camera_info -> width, height, distortion, etc. [Sensor_msgs/CameraInfo]
/image_cv2 -> image converted to cv2 [List.msg]
/image_raw/landmarked -> image with landmarks in ROS Image format [Sensor_msgs/Image]
/image_cv2/landmarked -> image with landmarks in cv2 format [List.msg]
/gaze_pose -> gaze pose in String format [Std_msgs/String]
Audio Processing:
/audio_features -> audio features in custom msg format [AudFeatures.msg]
/speech_emotion_analysis -> emotion probabilities in a custom msg format [EmoProbArr.msg]
/captured_audio -> audio captured from the microphone in ROS AudioData format [Audio_common_msgs]
/transcript -> transcript of speech in String format [Std_msgs/String]
In the lists above, each message type in brackets with a ".msg" extension and no package name refers to one of our custom messages, described in the list of developed msg files.
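For example, a minimal subscriber to the /transcript topic could look like this (the sketch uses only standard rospy and std_msgs APIs; the node name is arbitrary):

import rospy
from std_msgs.msg import String

def on_transcript(msg):
    # print each transcript published by speech_to_text_node
    rospy.loginfo("heard: %s", msg.data)

rospy.init_node("transcript_listener")
rospy.Subscriber("/transcript", String, on_transcript)
rospy.spin()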