Early Exploration of Natural Language Processing and Sentiment Analysis (2010–2011)

During my undergraduate studies around 2010–2011, I developed a series of experimental projects focused on natural language processing (NLP), sentiment analysis, and text mining using PHP and MySQL.

My goal was to explore a simple but challenging question:

How can a computer extract meaning and sentiment from human language?

At the time, modern machine learning frameworks and large language models were not widely available. Instead, I implemented a rule-based approach that combined text preprocessing, feature extraction, scoring algorithms, and simple learning mechanisms.

Text Preprocessing and Normalization

One of the first challenges I encountered was that computers treat different forms of the same word as separate entities. To address this, I implemented custom normalization and stemming rules:

$post = str_replace('ness','',$post);
$post = str_replace('ing','',$post);
$post = str_replace('ies','y',$post);
$post = str_replace('ied','y',$post);

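The aim was to reduce related forms such as “happiness”, “happily”, and “happy” to a common root, so that a single dictionary entry could match all of them and sentiment detection would improve. The four rules shown cover only part of that: they map “carries”, “carried”, and “carrying” to “carry”, but they turn “happiness” into “happi” and leave “happily” untouched. Because str_replace matches anywhere in a string, they also damage unrelated words (“single” becomes “sle”).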
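A more careful version of these rules might anchor each suffix to the end of a word and refuse to strip when too little of the word would remain. The sketch below is an illustration under those assumptions, not the original project code; stem_word, stem_text, the rule list, and the three-letter minimum are all hypothetical.

```php
<?php
// Hypothetical helper (not from the original project): apply the first
// suffix rule that matches at the END of the word.
function stem_word(string $word): string
{
    $word = strtolower($word);
    // Ordered rules: pattern => replacement. More specific suffixes come first.
    $rules = [
        '/iness$/' => 'y',  // happiness -> happy
        '/ily$/'   => 'y',  // happily   -> happy
        '/ies$/'   => 'y',  // carries   -> carry
        '/ied$/'   => 'y',  // carried   -> carry
        '/ness$/'  => '',   // sadness   -> sad
        '/ing$/'   => '',   // carrying  -> carry
    ];
    foreach ($rules as $pattern => $replacement) {
        if (!preg_match($pattern, $word)) {
            continue;
        }
        // Assumed guard: keep at least 3 letters before the suffix,
        // so short words such as "thing" are left unchanged.
        if (strlen(preg_replace($pattern, '', $word)) >= 3) {
            return preg_replace($pattern, $replacement, $word);
        }
    }
    return $word;
}

// Stem every alphabetic token in a post, leaving spacing and punctuation alone.
function stem_text(string $post): string
{
    return preg_replace_callback(
        '/[A-Za-z]+/',
        function (array $m): string {
            return stem_word($m[0]);
        },
        $post
    );
}
```

Anchoring with $ is what keeps “single” intact, since “ing” no longer matches in the middle of a word. Rules like these still over-stem some words, which is the usual trade-off of suffix stripping.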

Context-Aware Sentiment Analysis

I also discovered that sentiment analysis required contextual understanding. For example, the phrase “not good” cannot be treated the same as “good”.

// Count the current word only if it is not directly preceded by "not".
if ($previousword != 'not')
{
    $score++;
}

While simplistic (it inspects only the immediately preceding word and only the negator “not”), this gave the system a first form of context handling and highlighted the limitations of purely dictionary-based approaches.
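The previous-word check can be extended into a small scoring loop. The sketch below is illustrative only: the word lists and the function name score_with_negation are hypothetical, and it flips the sign of a negated word rather than merely skipping it, which is a deliberate departure from the snippet above.

```php
<?php
// Hypothetical lexicon and negator list, for illustration only.
const POSITIVE_WORDS = ['good', 'great', 'happy'];
const NEGATORS = ['not', 'never', 'no'];

function score_with_negation(string $text): int
{
    // Lowercase the text and pull out alphabetic tokens in order.
    preg_match_all('/[a-z]+/', strtolower($text), $matches);

    $score = 0;
    $previousword = '';
    foreach ($matches[0] as $word) {
        if (in_array($word, POSITIVE_WORDS, true)) {
            // A negator directly before the word flips its contribution.
            $score += in_array($previousword, NEGATORS, true) ? -1 : 1;
        }
        $previousword = $word;
    }
    return $score;
}
```

With these lists, “The food was good” scores 1 and “The food was not good” scores -1. A negator further away (“not really good”) is still missed, which is the same limitation the one-word lookback had.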

Feature Weighting

Another challenge was that some words carry more emotional significance than others. To account for this, I introduced weighted scoring:

if($k == 1)
    $score += 50;
else
    $score += 1;

This allowed strongly emotional terms to influence classification decisions more heavily than neutral words.
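One way to generalise the two-branch rule is to look each matched word up in a list of strong terms. In the sketch below the weights 50 and 1 mirror the snippet above, while the term list, constant names, and weighted_score are hypothetical.

```php
<?php
// The 50 and 1 come from the snippet above; everything else is assumed.
const STRONG_WEIGHT = 50;
const DEFAULT_WEIGHT = 1;
const STRONG_TERMS = ['ecstatic', 'furious', 'devastated']; // made-up examples

// Sum the weights of the words that matched the emotion dictionary.
function weighted_score(array $matchedWords): int
{
    $score = 0;
    foreach ($matchedWords as $word) {
        $score += in_array($word, STRONG_TERMS, true) ? STRONG_WEIGHT : DEFAULT_WEIGHT;
    }
    return $score;
}
```

With a 50:1 ratio, a single strong term outweighs dozens of ordinary matches, so the weights effectively decide which words can settle a classification on their own.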

Learning and Knowledge Accumulation

The system also maintained frequency counts and updated confidence levels based on repeated observations:

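UPDATE emotionsub
SET count = count + 1
-- Excerpt only: no row filter is shown here. Without a WHERE clause this statement increments every row in the table.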

This represented an early attempt to incorporate experience and feedback into the classification process.
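The counting idea can be shown without a database. The sketch below uses an in-memory array as a stand-in for the MySQL table and assumes one particular confidence measure (the share of a word's observations that carry a given label); the text above does not say how confidence was actually computed, and observe and confidence are hypothetical names.

```php
<?php
// Hypothetical in-memory stand-in for the counts table.
// $counts has the shape [word => [emotion => count]].
function observe(array &$counts, string $word, string $emotion): void
{
    // Same idea as "SET count = count + 1", limited to one word/emotion pair.
    $counts[$word][$emotion] = ($counts[$word][$emotion] ?? 0) + 1;
}

// Assumed measure: fraction of this word's observations labelled $emotion.
function confidence(array $counts, string $word, string $emotion): float
{
    $total = array_sum($counts[$word] ?? []);
    if ($total === 0) {
        return 0.0; // word never seen
    }
    return ($counts[$word][$emotion] ?? 0) / $total;
}
```

For example, after observing “smile” three times with the label “joy” and once with “sadness”, confidence($counts, 'smile', 'joy') is 0.75.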

Evaluation and Testing

Rather than relying solely on intuition, I tested the system against datasets and measured classification accuracy to evaluate improvements over time. This process introduced me to the importance of empirical evaluation and iterative refinement.
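Classification accuracy is the fraction of labelled examples the classifier gets right. A minimal sketch of such a measurement follows; the function name accuracy, the callable interface, and the [text, label] example format are assumptions for illustration, not the original test harness.

```php
<?php
// Hypothetical evaluation helper.
// $classify: any callable that maps a text to a label.
// $labelledExamples: list of [text, expectedLabel] pairs.
function accuracy(callable $classify, array $labelledExamples): float
{
    $total = count($labelledExamples);
    if ($total === 0) {
        return 0.0; // nothing to evaluate
    }
    $correct = 0;
    foreach ($labelledExamples as [$text, $expected]) {
        if ($classify($text) === $expected) {
            $correct++;
        }
    }
    return $correct / $total;
}
```

Re-running the same labelled set after each rule change gives a number to compare against the previous run, which is what makes iterative refinement measurable.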

Reflection

Looking back, these projects were not machine learning systems in the modern sense. However, they allowed me to explore many foundational concepts in natural language processing, including:

  • Text normalization
  • Stemming and lexical processing
  • Feature extraction
  • Sentiment classification
  • Context handling
  • Feature weighting
  • Knowledge storage
  • Empirical evaluation

Although the implementation techniques used in modern AI have evolved significantly, this experience developed my interest in understanding how machines process language and solve complex problems through structured analysis and iteration.