
Hi, I am Arnab

Arnab Sen Sharma

Ph.D. Student at Khoury College of Computer Sciences, Northeastern University.

I am a Ph.D. student at Northeastern University, Boston. I am fortunate to be advised by Prof. David Bau. At the Interpretable Neural Networks lab, we are doing exciting research on understanding the internal structure of large language models.

I received my B.Sc. in Computer Science and Engineering from Shahjalal University of Science and Technology (SUST). I worked as a research assistant at the SUST NLP lab during my final two semesters. After graduation, I worked at the Samsung R&D Institute, Bangladesh for a year, where I optimized the Chromium engine for the browser application in Samsung wearables. I was also part of a robotics project that won 3rd place at the Samsung ROBOT Hackathon-2018. I then came back to SUST to join as a Lecturer in the Computer Science department, where I was involved in multiple collaborative research projects in NLP and Data Science.

Publications

Mass-Editing Memory in a Transformer
ArXiv Preprint Submitted September 28, 2022

Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, and David Bau. Submitted to ICLR 2023.

Augmenting Online Classes with an Attention Tracking Tool May Improve Student Engagement

Arnab Sen Sharma, Mohammad Ruhul Amin, and Muztaba Fuad.

BD-SHS: A Benchmark Dataset for Learning to Detect Online Bangla Hate Speech in Different Social Contexts
ArXiv Preprint Submitted December 3, 2021

Nauros Romim, Mosahed Ahmed, Md Saiful Islam, Arnab Sen Sharma, Hriteshwar Talukder, Mohammad Ruhul Amin. “BD-SHS: A Benchmark Dataset for Learning to Detect Online Bangla Hate Speech in Different Social Contexts”. Presented as a poster at LREC 2022.

Challenges in Tracking the Risk of COVID-19 in Bangladesh; Evaluation of A Novel Method

Md. Enamul Hoque, Md. Shariful Islam, Arnab Sen Sharma, Rashedul Islam, and Mohammad Ruhul Amin. 2021. “Challenges in Tracking the Risk of COVID-19 in Bangladesh; Evaluation of A Novel Method”. In Proceedings of August 15 (KDD Workshop on Data-driven Humanitarian Mapping, 27th ACM SIGKDD Conference). ACM, New York, NY, USA.

Presenting a Larger Up-to-Date Movie Dataset and Investigating the Effects of Pre-Released Attributes on Gross Revenue
Journal of Computer Science Submitted: 16 June, 2021; Published: 13 October, 2021

Sharma, A. S., Roy, T., Rifat, S. A. & Mridul, M. A. (2021). Presenting a Larger Up-to-Date Movie Dataset and Investigating the Effects of Pre-Released Attributes on Gross Revenue. Journal of Computer Science, 17(10), 870-888. https://doi.org/10.3844/jcssp.2021.870.888

Automatic Detection of Satire in Bangla Documents; A CNN Approach Based on Hybrid Feature Extraction Model

A. S. Sharma, M. A. Mridul and M. S. Islam, “Automatic Detection of Satire in Bangla Documents; A CNN Approach Based on Hybrid Feature Extraction Model”, 2019 International Conference on Bangla Speech and Language Processing (ICBSLP), Sylhet, Bangladesh, 2019, pp. 1-5, doi:10.1109/ICBSLP47725.2019.201517. [IEEE Best Paper Award, awarded by ICBSLP, 2019]

A Deep CNN Model for Student Learning Pedagogy Detection Data Collection Using OCR

A. Sen Sharma, M. Ahmed Mridul, M. Jannat and M. Saiful Islam, “A Deep CNN Model for Student Learning Pedagogy Detection Data Collection Using OCR,” 2018 International Conference on Bangla Speech and Language Processing (ICBSLP), Sylhet, 2018, pp. 1-6, doi:10.1109/ICBSLP.2018.8554701.

Experiences

Lecturer
Department of Computer Science and Engineering,
Shahjalal University of Science and Technology (SUST)

July, 2019 - Present, Sylhet

Please check out my faculty profile at SUST here.

Responsibilities:
  • Courses taught include:
    • Software Engineering and Design Patterns (CSE 331-332)
    • Database Management Systems (CSE 333-334)
    • Web Technologies (CSE 446)
    • Introduction to Programming - Python (CSE 119_)
    • Introduction to Computing Applications (CSE 202_)
  • Served as a member of the Undergrad Final Year Thesis/Project Evaluation Committee, Sessions 2019‐20 and 2020‐21

Research Associate
Natural Language Processing research groups, SUST

August, 2019 - Present, Sylhet

Responsibilities:
  • Member of a panel that wrote a project proposal entitled “A BiLSTM based Deep Learning Approach for Hate Speech Detection in Bangla Text and a Benchmark Dataset for Hate Speech in Bangla”. The proposed project won a research grant worth 2,30,000 BDT from SUST Research Center (Project ID: AS/2020/1/26).
  • Advising undergrad students who are conducting research in Data Science and Machine Learning related domains.
  • Conducted workshops on basic Data Science and Machine Learning to train undergrad students of the CSE department, SUST, for their final-year research work.

Software Engineer I
Samsung R&D Institute, Bangladesh

August, 2018 - July, 2019, Dhaka

Responsibilities:
  • Worked on a Robot Simulation Project.
    • Coded the core features of Object Detection for robot navigation using YOLOv3.
    • Implemented baby-crying sound detection on Android devices using TensorFlow Lite.
  • Implemented some features of Chromium-efl for Tizen wearable (C++).
  • Designed and developed apps for Android platform (Java).
  • Designed and developed apps for Tizen platform (C++).
  • Built sensor data collection apps for Android Smartphones and Tizen watches.
  • Researched unique gesture detection on Android Smartphones and Tizen watches.

Education

B.Sc.(Engg) in Computer Science & Engineering
CGPA: 3.91 out of 4.00 (ranked first in my class)
Courses taken include:

Course Name                               Total Credit   Obtained Grade Point
Machine Learning                          4.5            4.00/4.00
Artificial Intelligence                   5              4.00/4.00
Structured Programming Language           6              4.00/4.00
Discrete Mathematics                      4.5            4.00/4.00
Data Structures                           5              4.00/4.00
Algorithm Design and Analysis             4.5            4.00/4.00
Object Oriented Programming Language      6              4.00/4.00
Numerical Analysis                        3.5            4.00/4.00
Basic Statistics and Probability          4              3.75/4.00
Database Systems                          6              3.875/4.00
Operating System and System Programming   4.5            4.00/4.00
Communication Engineering                 4.5            4.00/4.00
Microprocessor and Interfacing            4.5            4.00/4.00
Computer Networking                       4.5            4.00/4.00
Computer Graphics and Image Processing    4.5            4.00/4.00
Compiler Construction                     4.5            4.00/4.00
Digital Signal Processing                 4.5            4.00/4.00
Bioinformatics                            4.5            4.00/4.00

Extracurricular Activities
  • Had an active and distinguished career as a Competitive Programmer. Participated in many national and international onsite programming contests. Some notable contest performances are mentioned below.
    • ACM ICPC ASIA Regional Dhaka Site 2017, Rank 21, (SUST_PeakSeeker)
    • ACM ICPC ASIA Regional Dhaka Site 2016, Rank 11, (SUST_DeapThunder)
    • IUBAT IUPC, Rank 4, (SUST_DeapThunder)
    • BSCCL National Programming Contest (Hackathon) at UITS, Rank 6, (SUST_1)
    • LU Inter University Programming Contest 2016, Rank 4, (SUST_BlackJAM)
  • Arranged various competitive programming workshops for junior programmers of SUST.
  • Was appointed as a competitive programming trainer at Sylhet International University.
Higher Secondary School Certificate
GPA: 5 out of 5
Moulvibazar Govt. High School
2003-2010
Secondary School Certificate
GPA: 5 out of 5

Projects

COVID-19 in Bangladesh
Lead Developer March 2021 - Present

An interactive website built with ReactJS and Flask to showcase our work on tracking and forecasting the COVID‐19 situation in Bangladesh. The site is live at erdos.dsm.fordham.edu:3000. Please view the site on a desktop, laptop, or tablet; it is not optimized for vertical smartphone screens.

Attention Checker
Researcher June 2021 - Present

Calculating the attention score of an online class based on students’ gaze tracking, under the supervision of Dr. Mohammad Ruhul Amin, Assistant Professor, Fordham University.

Sentence Similarity Measurement for Bangla
Researcher March 2021

Experimented with a Siamese Neural Network architecture to calculate a similarity score between two sentences written in Bangla.

Algorithm Simulator
Developer 2016

A desktop application that helps novice competitive programmers gain a basic understanding of widely used algorithms, data structures, and sorting techniques.

DareIt?!
Developer 2016

An arcade game for the Android platform.

Travel Santa
Developer 2016

A web application that helps tourists find hotels and restaurants near tourist spots.

Shurodhwani
Developer 2017

An online audio sharing and listening platform.

EduHelper
Developer 2017

A web application that helps instructors make and share tutorials.

Box Office Revenue Analysis
Researcher April 2021 - June 2021

Prepared a large new dataset by scraping different sites. Performed statistical analysis to find out how box-office revenue is related to different pre-released attributes.

Skills

Accomplishments

3rd place at SAMSUNG Research ROBOT HACKATHON-2018
Samsung Research December 2018

This competition allowed participants to work with Samsung's newly manufactured lines of robots. I participated with an idea called BFF Robot as part of the team SRBD_Hyperbola. The idea was that a robot would become the daily companion of a child, adapting to the child as s/he grows; eventually, the child and the robot would become Best Friends Forever! We secured 3rd place among more than 500 participating teams worldwide and won prize money of 1000 USD.

Samsung Professional

Internal capability certification of Samsung Electronics. Attained the Professional certificate on my second attempt. I was the only one who attained the certificate in the test held on 24 November, 2018.

Completed Online Courses

DS102X: Machine Learning for Data Science and Analytics
ColumbiaX, edX February 25, 2021

In this course I learned,

  • What machine learning is and how it is related to statistics and data analysis
  • How machine learning uses computer algorithms to search for patterns in data
  • How to use data patterns to make decisions and predictions with real-world examples from healthcare involving genomics and preterm birth
  • How to uncover hidden themes in large collections of documents using topic modeling
  • How to prepare data, deal with missing data and create custom data analysis solutions for different industries
  • Basic and frequently used algorithmic techniques including sorting, searching, greedy algorithms and dynamic programming
DS101X: Statistical Thinking for Data Science and Analytics
ColumbiaX, edX February 7, 2021

In this course I learned,

  • Data collection, analysis and inference
  • Data classification to identify key traits and customers
  • Conditional probability: how to judge the probability of an event, based on certain conditions
  • How to use Bayesian modeling and inference for forecasting and studying public opinion
  • Basics of Linear Regression
  • Data visualization: how to use data to create compelling graphics
DL0101EN: Deep Learning Fundamentals with Keras
IBM, edX January 19, 2021

In this course I learned,

  • Some exciting applications of deep learning and why it is really rewarding to learn how to leverage deep learning skills.
  • Neural networks and how they learn and update their weights and biases.
  • Vanishing gradient problem and how to avoid/handle it.
  • Building a regression model using the Keras library.
  • Building a classification model using the Keras library.
  • Supervised deep learning models, such as convolutional neural networks and recurrent neural networks, and how to build a convolutional neural network using the Keras library.
  • Unsupervised learning models such as autoencoders.
Mathematics for Machine Learning: Linear Algebra

1st part of the Mathematics for Machine Learning series. In this course I learned the basics of Linear Algebra and how to apply these concepts to Machine Learning, including Eigenvalues and Eigenvectors, Transformation Matrices, and the workings of simple Page-Ranking algorithms.

Grade achieved:  96%

Mathematics for Machine Learning: Multivariate Calculus

2nd part of the Mathematics for Machine Learning series. In this course I learned the basics of Linear Regression, Vector Calculus, Multivariable Calculus, Gradient Descent, etc.

Grade achieved:  96.75%

Deep Learning: Face Recognition
Lynda May, 2019

Learned the steps involved in coding facial feature detection, representing a face as a set of measurements, and encoding faces. Also learned how to repurpose and adjust pre-existing systems.

Building Your First iOS App
Lynda March, 2019

While working at SRBD, I was required to implement an iOS application for sensor data collection on iPhones. I also implemented/translated some modules of existing Android apps for iOS.

iOS 12 Development Essential Training 1: Fundamentals, UI, and Architecture
Lynda April, 2019

While working at SRBD, I was required to implement an iOS application for sensor data collection on iPhones. I also implemented/translated some modules of existing Android apps for iOS.

Learning C++ Pointers
Lynda March, 2019

Took this course while studying the code flow of the V8 engine, the component of Chromium-based browsers that compiles and runs JavaScript. It refreshed my knowledge of C++ pointers.

Big Data Foundations: Techniques and Concepts
Lynda March, 2019

Took this course out of self-interest.

Learning Django
Lynda March, 2019

Took this course out of self-interest.

Swift 4 Essential Training
Lynda December, 2018

Misc