Package org.cis1200

Class TwitterBot

java.lang.Object
org.cis1200.TwitterBot

public class TwitterBot extends Object
This is the class where everything you've worked on thus far comes together! You can see that we've provided a path to a CSV file full of tweets and the column from which they can be extracted. When run as an application, this program builds a Markov Chain from the training data in the CSV file, generates 10 random tweets, and prints them to the terminal.

This class also provides the writeTweetsToFile method, which can be used to create a file containing randomly generated tweets.

Note: All IOExceptions thrown by writers should be caught and handled properly.

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    (package private) MarkovChain
    mc
    The MarkovChain you'll be using to generate tweets
    (package private) NumberGenerator
    ng
    Random number generator used to pick random numbers
    (package private) static final String
    PATH_TO_OUTPUT_TWEETS
    File to store generated tweets
    (package private) static final String
    PATH_TO_TWEETS
    This is a path to the CSV file containing the tweets.
    (package private) static final int
    TWEET_COLUMN
    Column in the PATH_TO_TWEETS CSV file to read tweets from
  • Constructor Summary

    Constructors
    Constructor
    Description
    TwitterBot(BufferedReader br, int tweetColumn)
    Given a column and a buffered reader, initializes the TwitterBot by training the MarkovChain with sentences sourced from the reader.
    TwitterBot(BufferedReader br, int tweetColumn, NumberGenerator ng)
    Given a column and a buffered reader, initializes the TwitterBot by training the MarkovChain with all the sentences obtained as training data from the buffered reader.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    fixDistribution(List<String> tweet)
    Modifies all MarkovChains to output sentences in the order specified.
    int
    fixPunctuation(char punc)
    A helper function to return the numerical index of the punctuation.
    String
    generateTweet(int numWords)
    Generates a tweet of a given number of words by using the populated MarkovChain.
    String
    generateTweetChars(int numChars)
    Generates a tweet using generateTweet().
    List<String>
    generateTweets(int numTweets, int numChars)
    Generates a series of tweets using generateTweetChars().
    static boolean
    isPunctuated(String s)
    A helper function to determine if a string ends in punctuation.
    boolean
    isPunctuation(String s)
    Returns true if the passed in string is punctuation.
    static void
    main(String[] args)
    Prints ten generated tweets to the console so you can see how your bot is performing!
    String
    randomPunctuation()
    A helper function for providing a random punctuation String.
    void
    writeStringsToFile(List<String> stringsToWrite, String filePath, boolean append)
    Given a List of Strings, prints those Strings to a file (one String per line in the file).
    void
    writeTweetsToFile(int numTweets, int numChars, String filePath, boolean append)
    Generates tweets and writes them to a file.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • PATH_TO_TWEETS

      static final String PATH_TO_TWEETS
      This is a path to the CSV file containing the tweets. The main method below uses the tweets in this file when constructing the TwitterBot. If you want to run the TwitterBot on the other files we provide, change this path to a different file. (You may need to adjust TWEET_COLUMN too.)
    • TWEET_COLUMN

      static final int TWEET_COLUMN
      Column in the PATH_TO_TWEETS CSV file to read tweets from
    • PATH_TO_OUTPUT_TWEETS

      static final String PATH_TO_OUTPUT_TWEETS
      File to store generated tweets
    • mc

      MarkovChain mc
      The MarkovChain you'll be using to generate tweets
    • ng

      NumberGenerator ng
      Random number generator used to pick random numbers
  • Constructor Details

    • TwitterBot

      public TwitterBot(BufferedReader br, int tweetColumn)
      Given a column and a buffered reader, initializes the TwitterBot by training the MarkovChain with sentences sourced from the reader. Uses a new RandomNumberGenerator() as its number generator.
      Parameters:
      br - a buffered reader containing tweet data
      tweetColumn - the column in the reader where the text of the tweet itself is stored
    • TwitterBot

      public TwitterBot(BufferedReader br, int tweetColumn, NumberGenerator ng)
      Given a column and a buffered reader, initializes the TwitterBot by training the MarkovChain with all the sentences obtained as training data from the buffered reader.
      Parameters:
      br - a buffered reader containing tweet data
      tweetColumn - the column in the buffered reader where the text of the tweet itself is stored
      ng - a NumberGenerator for the ng field, also to be passed to MarkovChain
  • Method Details

    • writeStringsToFile

      public void writeStringsToFile(List<String> stringsToWrite, String filePath, boolean append)
      Given a List of Strings, prints those Strings to a file (one String per line in the file). This method uses BufferedWriter, the flip side to BufferedReader. Ensure that each tweet you generate is written on its own line in the file produced.

      You may assume none of the arguments or strings passed in will be null.

      If the process of writing the data triggers an IOException, you should catch it and stop writing. (You can also print an error message to the terminal, but we will not test that behavior.)

      Parameters:
      stringsToWrite - a List of Strings to write to the file
      filePath - the string containing the path to the file where the tweets should be written
      append - a boolean indicating whether the new tweets should be appended to the current file or should overwrite its previous contents
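
      The behavior described above can be sketched as follows. This is a minimal illustration rather than the reference solution: the class name WriteStringsSketch and the static method form are assumptions made so the example runs on its own, and the error message is optional.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical holder class; in the assignment this logic lives in TwitterBot.
class WriteStringsSketch {

    // Writes each string on its own line and stops at the first IOException.
    static void writeStringsToFile(List<String> stringsToWrite, String filePath, boolean append) {
        // try-with-resources closes (and flushes) the writer even if a write fails.
        // The second FileWriter argument selects append vs. overwrite.
        try (BufferedWriter bw = new BufferedWriter(new FileWriter(filePath, append))) {
            for (String s : stringsToWrite) {
                bw.write(s);
                bw.newLine();
            }
        } catch (IOException e) {
            // Optional: report the problem. The exception is handled here, not rethrown.
            System.err.println("Could not write to " + filePath + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("tweets", ".txt");
        writeStringsToFile(List.of("first tweet.", "second tweet!"), tmp.toString(), false);
        writeStringsToFile(List.of("third tweet?"), tmp.toString(), true); // appended
        System.out.println(Files.readAllLines(tmp));
        Files.delete(tmp);
    }
}
```

      Catching the exception inside the method keeps callers such as writeTweetsToFile free of checked-exception handling, which matches the note that IOExceptions thrown by writers should be caught and handled.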
    • writeTweetsToFile

      public void writeTweetsToFile(int numTweets, int numChars, String filePath, boolean append)
      Generates tweets and writes them to a file.
      Parameters:
      numTweets - the number of tweets that should be written
      numChars - the number of characters in each tweet
      filePath - the path to a file to write the tweets to
      append - a boolean indicating whether the new tweets should be appended to the current file or should overwrite its previous contents
    • generateTweet

      public String generateTweet(int numWords)
      Generates a tweet of a given number of words by using the populated MarkovChain. Remember in the writeup where we explained how to use MarkovChain to pick a random starting word and then pick each subsequent word based on the probability that it follows the one before? This is where you implement that core logic!

      Use the (assumed to be trained) MarkovChain as an iterator to build up a String that represents the tweet that's returned.

      1. Reset the MarkovChain (to prepare it to generate a new sentence).
      2. Validate the numWords argument.
      3. Repeatedly generate new words to add to the tweet:
         a. If the MarkovChain has no more values in its Iterator but the tweet is not yet at the required number of words, use randomPunctuation() to end the sentence and then reset() to begin the next sentence with a random start word.

      Your tweet should be properly formatted with one space between each word and between sentences. It should not contain any leading or trailing whitespace. You should leave the words uncapitalized, just as they are from TweetParser. All tweets should end in punctuation.

      You should return an empty string if there were no sentences available to access when calling hasNext or if the number of words is 0. You also need to do some input validation to make sure the number of words is not negative.

      Parameters:
      numWords - the desired number of words of the tweet to be produced
      Returns:
      a String representing a generated tweet
      Throws:
      IllegalArgumentException - if numWords is negative
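
      The steps above can be sketched as follows. This is one possible shape of the loop, not the reference solution. WordSource and FixedChain are hypothetical stand-ins for MarkovChain (only reset(), hasNext(), and next() are assumed), and the punctuation supplier stands in for randomPunctuation().

```java
import java.util.List;
import java.util.function.Supplier;

class GenerateTweetSketch {

    /** Hypothetical stand-in for the parts of MarkovChain this sketch relies on. */
    interface WordSource {
        void reset();
        boolean hasNext();
        String next();
    }

    /** Toy source that replays one fixed sentence after every reset (no randomness). */
    static class FixedChain implements WordSource {
        private final List<String> words;
        private int index = 0;

        FixedChain(List<String> words) { this.words = words; }

        public void reset() { index = 0; }
        public boolean hasNext() { return index < words.size(); }
        public String next() { return words.get(index++); }
    }

    static String generateTweet(WordSource mc, Supplier<String> punctuation, int numWords) {
        mc.reset();                                    // step 1
        if (numWords < 0) {                            // step 2
            throw new IllegalArgumentException("numWords must not be negative");
        }
        if (numWords == 0 || !mc.hasNext()) {
            return "";                                 // nothing to generate
        }
        StringBuilder tweet = new StringBuilder();
        int written = 0;
        while (written < numWords) {                   // step 3
            if (!mc.hasNext()) {                       // step 3.a: sentence ran out early
                tweet.append(punctuation.get());
                mc.reset();
                if (!mc.hasNext()) {
                    return tweet.toString();           // defensive: source became empty
                }
                continue;
            }
            if (written > 0) {
                tweet.append(' ');                     // single space between words and sentences
            }
            tweet.append(mc.next());
            written++;
        }
        return tweet.append(punctuation.get()).toString(); // every tweet ends in punctuation
    }

    public static void main(String[] args) {
        WordSource chain = new FixedChain(List.of("hello", "world"));
        System.out.println(generateTweet(chain, () -> ".", 5));
        // prints: hello world. hello world. hello.
    }
}
```

      Appending the space before each word (rather than after) is what keeps the result free of trailing whitespace.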
    • generateTweets

      public List<String> generateTweets(int numTweets, int numChars)
      Generates a series of tweets using generateTweetChars().
      Parameters:
      numTweets - the number of tweets to generate
      numChars - the number of characters that each generated tweet should have
      Returns:
      a List of Strings where each element is a tweet
    • generateTweetChars

      public String generateTweetChars(int numChars)
      Generates a tweet using generateTweet().
      Parameters:
      numChars - the desired number of characters of the tweet to be produced
      Returns:
      a String representing a generated tweet
      Throws:
      IllegalArgumentException - if numChars is negative
    • randomPunctuation

      public String randomPunctuation()
      A helper function for providing a random punctuation String.
      Returns:
      a string containing just one punctuation character, specifically '.' 70% of the time and ';', '?', and '!' each 10% of the time.
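
      One way to obtain the 70/10/10/10 split is to draw a number from 0 to 9 and map three of the ten outcomes to the rarer marks. The sketch below assumes a NumberGenerator with a next(int bound) method returning a value in [0, bound). That interface is declared here as a hypothetical stand-in, and the mapping of specific numbers to specific marks is also an assumption.

```java
import java.util.Random;

class PunctuationSketch {

    /** Hypothetical stand-in for the course's NumberGenerator. */
    interface NumberGenerator {
        int next(int bound); // assumed to return a value in [0, bound)
    }

    static String randomPunctuation(NumberGenerator ng) {
        char[] rare = {';', '?', '!'};
        int m = ng.next(10);                 // ten equally likely outcomes: 0..9
        if (m < rare.length) {
            return String.valueOf(rare[m]);  // 0, 1, 2: each mark has a 1-in-10 chance
        }
        return ".";                          // 3..9: seven of the ten outcomes
    }

    public static void main(String[] args) {
        Random random = new Random();
        NumberGenerator ng = bound -> random.nextInt(bound);
        for (int i = 0; i < 5; i++) {
            System.out.print(randomPunctuation(ng) + " ");
        }
        System.out.println();
    }
}
```

      Passing the generator in makes the distribution testable: a generator that always returns a fixed number yields a fixed punctuation mark.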
    • fixPunctuation

      public int fixPunctuation(char punc)
      A helper function to return the numerical index of the punctuation.
      Parameters:
      punc - an input char to return the index of
      Returns:
      the numerical index of the punctuation
    • isPunctuation

      public boolean isPunctuation(String s)
      Returns true if the passed in string is punctuation.
      Parameters:
      s - a string to check whether it's punctuation
      Returns:
      true if the string is punctuation, false otherwise.
    • isPunctuated

      public static boolean isPunctuated(String s)
      A helper function to determine if a string ends in punctuation.
      Parameters:
      s - an input string to check for punctuation
      Returns:
      true if the string s ends in punctuation
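
      The two checks, isPunctuation and isPunctuated, differ only in whether the whole string or just its last character must be punctuation. A minimal sketch follows, assuming the punctuation set is exactly the four marks that randomPunctuation() can return; the class name and the null handling are hypothetical choices, since this page does not specify them.

```java
class PunctuationChecks {

    // Assumed punctuation set: the marks randomPunctuation() is documented to return.
    static final String PUNCTUATION = ".;?!";

    // True only when the whole string is a single punctuation character.
    static boolean isPunctuation(String s) {
        return s != null && s.length() == 1 && PUNCTUATION.indexOf(s.charAt(0)) >= 0;
    }

    // True when the last character of the string is a punctuation character.
    static boolean isPunctuated(String s) {
        return s != null && !s.isEmpty()
                && PUNCTUATION.indexOf(s.charAt(s.length() - 1)) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isPunctuation("!"));          // true
        System.out.println(isPunctuation("hi!"));        // false
        System.out.println(isPunctuated("hello world.")); // true
        System.out.println(isPunctuated("hello world"));  // false
    }
}
```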
    • main

      public static void main(String[] args)
      Prints ten generated tweets to the console so you can see how your bot is performing!
    • fixDistribution

      public void fixDistribution(List<String> tweet)
      Modifies all MarkovChains to output sentences in the order specified.

      The goal of `fixDistribution` is to ensure that our underlying probability distributions output a tweet in the order that we desire. Our implementation does this by splitting the work between 2 LNGs:

      1. TwitterBot LNG: This LNG serves to make sure our punctuation is output in the order we expect.
      2. MarkovChain LNG: This LNG makes sure the tweets are output in the proper order. It is built by running `fixDistribution` on the MarkovChain with punctuation replaced by null.

      Assumes that the expected tweet is punctuated.

      Parameters:
      tweet - an ordered list of words and punctuation that the MarkovChain should output