HOWTO create a twitter bot with bash, sed, awk, and curl

From SHellium Wiki



I signed up for shellium with the intention of providing a better home for my twitter bots.

Currently I've got two reply bots, TweeTzu and mercedes.

Tweetzu provides an I Ching hexagram and judgment to: 1) anyone who asks @tweeching for a fortune (by mentioning his name in a tweet), and 2) every ten minutes, a random twitterer who recently used the word "fortune" or "iching" in a tweet.

Mercedes (Mercedes Porsche Jaguar) is a stripper, and will give you a typical stripper answer every time 1) you send a tweet to @askmpj, or 2) somebody says "ask a stripper" in a tweet.

I'd like to get more full-featured, providing context-sensitive answers for a more convincing AI experience, but for now, this is the result of one day's effort.

Tweetzu is kind of kludgy, with some massive redundancies, so I'm not going to include that code, but mercedes is coming along nicely. Here are the bash script and the associated awk and sed scripts for your perusal.

I've tried to put in enough comments to make it easy to follow. If you have any questions, shoot.

Being unfamiliar as of yet with the shellium environment, I'm making no promises that this will work as currently written. In all likelihood some porting will need to be done.

The Twitter API is simple and elegant: by changing the call you can change the format of the response. I chose JSON because it's the preferred format for Twitter apps and, with just a tiny bit of sed, it makes for a great spreadsheet for awk to work with. I cut half the work out simply by going with JSON as opposed to Atom.
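To make the "JSON as a spreadsheet" idea concrete, here is a rough sketch on a made-up, heavily simplified record (not real Twitter output; the field names and values are only for illustration). It applies the same substitutions as mercedes.sed further down and assumes GNU sed, which understands \t and \n in the replacement text.

```shell
#!/bin/bash
# Hypothetical, simplified JSON record; real search results carry many more fields.
SAMPLE='{"from_user":"alice","id":12345,"text":"ask a stripper"}'

# Break the record apart on braces, then turn the JSON punctuation into tabs.
FLAT=$(printf '%s\n' "${SAMPLE}" | sed -e 's/[{}]/\n/g' -e 's/":"/\t/g' \
	-e 's/,"/\t/g' -e 's/":/\t/g' -e 's/"/ /g')

# Every key and value is now its own tab-separated column, so awk can pick by number.
USER_NAME=$(printf '%s\n' "${FLAT}" | awk -F'\t' '/from_user/ { print $2 }')
TWEET_ID=$(printf '%s\n' "${FLAT}" | awk -F'\t' '/from_user/ { print $4 }')

# The last substitution turns quotes into spaces, so string values can pick up
# a trailing space; ${VAR% } trims one off.
echo "user=${USER_NAME% } id=${TWEET_ID}"
```

The column numbers depend entirely on the order of the fields in the response, which is why the awk script for the real feed uses different ones.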

taojoannes@joannestown ~/mercedes $ crontab -l
# m h dom mon dow command
0,15,30,45 * * * * /home/taojoannes/mercedes/mercedes.bash > /home/taojoannes/mercedes/mercedes.log

taojoannes@joannestown ~/mercedes $ uname -a
Linux joannestown 2.6.28-17-generic #58-Ubuntu SMP Tue Dec 1 18:57:07 UTC 2009 i686 GNU/Linux
taojoannes@joannestown ~/mercedes $ more mercedes.*

::::::::::::::
mercedes.bash
::::::::::::::
#!/bin/bash -x
# This is the brains of Mercedes P Jaguar, stripper.
# Copyright 2010 TaoJoannes http://taojoannes.com
# Exit statuses
# 54 = Data Dirs Not Found
# Set the data directories
export MPJDIR="${HOME}/mercedes"
# Make sure the data directories exist
if [ ! -d "${MPJDIR}" ]
then 
	echo "Data Dir not found, exiting"
	exit 54
fi
# Set Log and Temporary Files
export MPJTMP="${MPJDIR}/tmp"
export MPJLOG="${MPJDIR}/log"
export MPJWRK="${MPJDIR}/working"
# Clean Up Old Files
cd "${MPJDIR}" || exit 54
for WORKINGFILE in "${MPJTMP}" "${MPJWRK}"
do
	if [ -f "${WORKINGFILE}" ]
	then
		rm "${WORKINGFILE}"
	fi
done
touch "${MPJLOG}"
# Set the username and password
export MPJUSER=TWITTERUSERNAME
export MPJPASS=TWITTERPASSWORD
# Get the latest @replies
# (URLs are quoted so the shell does not treat the ? as a wildcard)
/usr/bin/curl "http://search.twitter.com/search.json?q=%40${MPJUSER}" > "${MPJTMP}"
# Get the latest "ask a stripper" mentions
/usr/bin/curl "http://search.twitter.com/search.json?phrase=ask+a+stripper" >> "${MPJTMP}"
# Process the file to get it out of JSON format and into a tab-delimited spreadsheet, and see if there are any new replies we need to respond to.
sed -f mercedes.sed "${MPJTMP}" | awk -f mercedes.awk > "${MPJWRK}"
for TAG in $(awk '{ print $1 }' "${MPJWRK}")
do
	if grep -q "${TAG}" "${MPJLOG}"
	then 
		echo X
	else
		# Respond to new replies
		# Get the user name from the working file
		export MPJTRGT=$(grep "${TAG}" "${MPJWRK}" | awk -F"\t" '{ print $2 }')
		# get mercedes' response
		export FORTUNE="$(/usr/games/fortune askmercedes)"
		# build a link to the original tweet so we know why she's talking to us
		export STATLINK="http://twitter.com/${MPJTRGT% }/statuses/${TAG}"
		# build the status message
		export STATUS="${FORTUNE} @${MPJTRGT} ${STATLINK}"
		# send the status update using curl
		/usr/bin/curl --basic --user "${MPJUSER}:${MPJPASS}" --data-urlencode "status=${STATUS}" http://twitter.com/statuses/update.xml
		# wait a bit so we don't flood
		sleep 5
		# Log that the message was sent. Sometimes broken for RTs; must figure this out.
		echo "${TAG}" >> "${MPJLOG}"
	fi
done 
exit 0
::::::::::::::
mercedes.sed - turns the JSON output into a tab-delimited spreadsheet
::::::::::::::
s/[}{]/\
/g
s/":"/\t/g
s/,"/\t/g
s/":/\t/g
s/","/\t/g
s/"/ /g
::::::::::::::
mercedes.awk - pulls out the data we're concerned with working on from the spreadsheet
::::::::::::::
BEGIN { FS = "\t" }
/profile/ { print $12 "\t" $6 }

This is just meant as a fun diversion, a little surprise for the folks of the twitterverse, and something to keep me entertained.

For a query like "ask a stripper", days go by with nobody saying that exact phrase, so I can grab them all, screen out the ones we've responded to, and not worry about flooding the API.
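The screening step can be sketched on its own as below. The file names and IDs are made up for the demo, and the use of grep -Fx is my suggestion rather than what the script above does: it only treats an ID as seen when it matches a whole log line, whereas a plain grep would also count 100 as seen once 1001 is in the log.

```shell
#!/bin/bash
# Hypothetical scratch files standing in for the log and the working file.
DEMO_DIR=$(mktemp -d)
printf '%s\n' 1001 1002 > "${DEMO_DIR}/log"                 # IDs already answered
printf '1001\talice\n1003\tbob\n' > "${DEMO_DIR}/working"   # ID <tab> user name

NEW_IDS=""
for TAG in $(cut -f1 "${DEMO_DIR}/working")
do
	# -F: fixed string, -x: whole line must match, -q: no output
	if grep -Fxq "${TAG}" "${DEMO_DIR}/log"
	then
		continue
	fi
	# A real run would send the reply here; the demo only records the ID.
	NEW_IDS="${NEW_IDS}${TAG} "
	echo "${TAG}" >> "${DEMO_DIR}/log"
done
echo "new: ${NEW_IDS}"
rm -r "${DEMO_DIR}"
```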

"Fortune" gets about a thousand hits a minute, so for those types of very common topics, I recommend trimming the working file with head to no more than 10-15 entries.
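A minimal sketch of that trimming, using generated stand-in rows instead of real search results (the row contents and the cut-off of 15 are just for the demo):

```shell
#!/bin/bash
# Stand-in for a busy query: 100 fake "ID <tab> user" rows (made-up data).
WORK=$(mktemp)
seq 1 100 | awk '{ print $1 "\tuser" $1 }' > "${WORK}"

# Keep only the first 15 rows before the reply loop would run.
TRIMMED=$(mktemp)
head -n 15 "${WORK}" > "${TRIMMED}"
ROWS=$(wc -l < "${TRIMMED}" | tr -d ' ')
echo "rows kept: ${ROWS}"
rm -f "${WORK}" "${TRIMMED}"
```

In the script itself the same effect could probably be had by adding head to the existing pipeline, something like `sed -f mercedes.sed ${MPJTMP} | awk -f mercedes.awk | head -n 15 > ${MPJWRK}`.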

The only other piece to this is the fortune file, which contains all of her responses.

The standard fortune program is used, so all you have to do is create a text file with fortunes separated by % on a line by itself, then run the command:

/usr/bin/strfile original_text_file

Example:

joannestown fortunes # strfile askmercedes 
"askmercedes.dat" created
There were 111 strings
Longest string: 103 bytes
Shortest string: 3 bytes
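For anyone who has not built a fortune file before, the layout looks roughly like this. The file name demofortunes and two of the three entries are invented for the example; the real askmercedes file is not reproduced here. The sketch only calls strfile if it happens to be installed.

```shell
#!/bin/bash
# Hypothetical file name and contents, written to a scratch directory.
FORTUNE_DIR=$(mktemp -d)
cat > "${FORTUNE_DIR}/demofortunes" <<'EOF'
Gimme a dollar
%
Ask me again after my shift
%
That costs extra
EOF

# Count entries: one more than the number of lines holding only "%".
SEPARATORS=$(grep -c '^%$' "${FORTUNE_DIR}/demofortunes")
ENTRIES=$((SEPARATORS + 1))
echo "entries: ${ENTRIES}"

# Build the index (demofortunes.dat) only if strfile is available on this machine.
if command -v strfile > /dev/null 2>&1
then
	strfile "${FORTUNE_DIR}/demofortunes"
fi
```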

If the text and dat file are in the current directory, you can just use:

/usr/games/fortune askmercedes

to receive Mercedes' wisdom.

If the files are not in the current directory, fortune looks in its default location for a matching set of files, and exits with a 1 if it can't find them.

taojoannes@joannestown ~/mercedes $ ls
log mercedes.bash mercedes.log tmp
mercedes.awk mercedesfortunes mercedes.sed working
taojoannes@joannestown ~/mercedes $ strfile mercedesfortunes
"mercedesfortunes.dat" created
There were 111 strings
Longest string: 103 bytes
Shortest string: 3 bytes
taojoannes@joannestown ~/mercedes $ fortune mercedesfortunes
Are you aware of the wonderful plan jesus has for your life?
taojoannes@joannestown ~/mercedes $ cd
taojoannes@joannestown ~ $ fortune mercedesfortunes
No fortunes found
taojoannes@joannestown ~ $ echo $?
1
taojoannes@joannestown ~ $ cd -
/home/taojoannes/mercedes
taojoannes@joannestown ~/mercedes $ fortune mercedesfortunes
Gimme a dollar
taojoannes@joannestown ~/mercedes $ echo $?
0
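
Since fortune signals a missing file through its exit status, a script can check for that before tweeting anything. The following is only a sketch: get_reply is a hypothetical helper that is not part of mercedes.bash, and the directory and file name are taken from the transcript above.

```shell
#!/bin/bash
# get_reply is a hypothetical helper; it is not part of the original script.
get_reply() {
	# $1 = directory holding the fortune file, $2 = fortune file name.
	# The subshell keeps the cd from affecting the caller.
	( cd "$1" 2> /dev/null && fortune "$2" 2> /dev/null )
}

if REPLY_TEXT=$(get_reply "${HOME}/mercedes" mercedesfortunes) && [ -n "${REPLY_TEXT}" ]
then
	echo "would tweet: ${REPLY_TEXT}"
	RESULT=ok
else
	# Non-zero exit or empty output: better to stay quiet than tweet a blank reply.
	echo "no fortune available, skipping this run" >&2
	RESULT=skipped
fi
```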