Monday, October 7, 2013

Keyword Density Checker

Many keyword density checkers are available on the web. Novice webmasters often use one to get information about the content of their web pages. A good keyword density checker helps a webmaster keep pages in line with SEO (Search Engine Optimization) guidelines so that they rank well in search engines. Many SEO experts consider the optimum keyword density to be 1 to 3 percent. A density above this range may be treated as spam and penalized by the search engines.

In this post, you will learn to create a simple Android app that checks the keyword density of a web page. I called this project KeywordDensityChecker, so create a project in Eclipse with that name. KeywordDensityChecker reads the content of a web page and analyzes it to produce a keyword density table that shows keywords, frequencies, and percentages. The frequency of a keyword is the number of times it occurs in the web page. Its percentage is calculated by dividing the frequency by the total number of keywords analyzed and multiplying the result by 100.

For the user interface of this simple keyword density checker, we need one EditText, one Button, and one ListView. The EditText allows the user to enter a web page address. Pushing the Button reads the content of the web page and shows the analyzed result in the ListView. These views are defined in the activity_main.xml file, which is the layout resource of the MainActivity class.
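
The percentage calculation described earlier can be illustrated with a small worked example. This is only a sketch: the class name DensityExample and the numbers are made up for illustration, and integer division is assumed, as in the app's own int arithmetic.

```java
public class DensityExample {

    // percentage = 100 * frequency / total keywords analyzed (integer division)
    static int percent(int frequency, int totalKeywords) {
        if (totalKeywords <= 0) {
            return 0; // nothing was analyzed, so avoid dividing by zero
        }
        return 100 * frequency / totalKeywords;
    }

    public static void main(String[] args) {
        // a keyword found 6 times among 200 analyzed keywords
        System.out.println(percent(6, 200) + "%"); // prints 3%
    }
}
```

With these numbers the keyword sits at the top of the 1 to 3 percent range mentioned at the start of the post.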

activity_main.xml file

<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:paddingBottom="@dimen/activity_vertical_margin"
    android:paddingLeft="@dimen/activity_horizontal_margin"
    android:paddingRight="@dimen/activity_horizontal_margin"
    android:paddingTop="@dimen/activity_vertical_margin"
    tools:context=".MainActivity"
    android:orientation="vertical"

     >

  <EditText
        android:id="@+id/txt_input"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:inputType="text"
        android:hint="Enter web page address"
        />
     
<Button
          android:id="@+id/bt_extract"
          android:layout_width="match_parent"
          android:layout_height="wrap_content"
          android:text="Check"
          android:onClick="check"
          />

 
        <ListView
            android:id="@+id/density_list"
            android:layout_width="fill_parent"
            android:layout_height="wrap_content"
            android:paddingBottom="5dp"
            android:paddingTop="5dp"
     
            />    

</LinearLayout>


The ListView displays keywords, frequencies, and percentages. Thus, the ListView needs to be customized to show these three items. The layout file (listlayout.xml) of the ListView will have three TextView components defined. One TextView displays the keyword; another one displays the frequency; and the last one displays the percentage.

keyword density checker



The content of the listlayout.xml file is shown below.

listlayout.xml file

<?xml version="1.0" encoding="utf-8"?>

<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="fill_parent"
    android:layout_height="wrap_content"
    android:orientation="horizontal"
    android:padding="5dip" >

<TextView
    android:id="@+id/keyword"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:padding="10sp"
        android:textSize="20sp"
        android:textColor="#0000ff"
        android:textStyle="bold"
       
         >
</TextView>

<TextView
        android:id="@+id/count"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:padding="10sp"
        android:textSize="20sp"
        android:textColor="#0000ff"
 
        android:textStyle="bold" >
</TextView>

<TextView
    android:id="@+id/percent"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:padding="10sp"
        android:textSize="20sp"
        android:textColor="#0000ff"
   
        android:textStyle="bold" >
</TextView>
</LinearLayout>


Now we take a look at the MainActivity.java file, which contains the MainActivity class. This is the starting point of our KeywordDensityChecker app. In this class, the check method is defined. It is called when the Check button is pushed. In this method, we get references to the txt_input EditText and the density_list ListView, and a Checker object is created. All the methods used to read the content of the web page, analyze it, and prepare the data (keywords, frequencies, and percentages) needed to construct the keyword density table are defined in the Checker class.

MainActivity.java file

package com.example.keyworddensitychecker;

import android.os.AsyncTask;
import android.os.Bundle;
import android.app.Activity;
import android.app.ProgressDialog;
import android.content.Context;
import android.view.Menu;
import android.view.View;
import android.widget.EditText;
import android.widget.ListView;
import android.widget.Toast;


public class MainActivity extends Activity {

private Context context;
private ListView list;
private Checker checker;
ProgressDialog pd;
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        context=this;      
    }
 
    @Override
    protected void onDestroy() {
    if (pd!=null) {
pd.dismiss();

}
    super.onDestroy();
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the menu; this adds items to the action bar if it is present.
        getMenuInflater().inflate(R.menu.main, menu);
        return true;
    }
 
 
 
    public void check(View view){
        EditText txtaddress=(EditText)findViewById(R.id.txt_input);
        String address=txtaddress.getText().toString();
        list=(ListView)findViewById(R.id.density_list);
        //create a new Checker for every check so that old results are not mixed in
        checker=new Checker();
        if(address.length()>0){
            BackTask bt=new BackTask();
            //the address is the only parameter the background task needs
            bt.execute(address);
        }
        else
            Toast.makeText(context, "Please input web page address", Toast.LENGTH_SHORT).show();
    }
 
    private class BackTask extends AsyncTask<String,Void,Void>{  
   
   
    protected void onPreExecute(){
    super.onPreExecute();
    //show process dialog
    pd = new ProgressDialog(context);
pd.setTitle("Checking...");
pd.setMessage("Please wait.");
pd.setCancelable(true);
pd.setIndeterminate(true);
pd.show();
   
   
    }
    protected Void doInBackground(String...params){    
    try{
    //start analyzing the web page
    checker.processChecking(params[0]);
    }catch(Exception e){
    pd.dismiss();   //close the dialog if error occurs
    }
return null;
   
    }
   
   
   
    protected void onPostExecute(Void result){
    //close the progress dialog
    pd.dismiss();
    //create ListAdapterModel object for the ListView so that
    //the keyword density table can be displayed
    ListAdapterModel lam=new ListAdapterModel(context,R.layout.listlayout,checker.getWords(),checker.getCounts(),checker.getPercents());
    list.setAdapter(lam);
    }
   
    }



 
}


Reading the content of the web page, analyzing it, and displaying the result can take a long time, so it is a good idea to run these processes in a background thread. In the previous post (Web Downloader), you learned to use the IntentService class to put the web downloading process in a background thread so that it runs without locking the user interface. In KeywordDensityChecker, you learn another tool for background processing: AsyncTask. AsyncTask has three important methods: onPreExecute, doInBackground, and onPostExecute. In the onPreExecute method, you can write code to do something before the background process starts. In this app, a ProgressDialog is displayed to tell the user to wait.

Progress Dialog to wait while analyzing the web page


In the doInBackground method, you place the code that makes up the background process. In this case, the processChecking method of the Checker class is called to read the content of the web page, analyze it, and prepare the data needed to construct the keyword density table. In the onPostExecute method, you can write code that runs after the background process completes. In our KeywordDensityChecker app, this means closing the progress dialog and constructing the ListAdapterModel object for the ListView so that the keyword density table can be shown to the user.
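
To make the order of the three phases easier to see, here is a plain-Java sketch of the same before, background, after sequence. It is not Android code and does not use AsyncTask: the class ThreePhaseSketch and the method slowWork are hypothetical names, and an ExecutorService stands in for the background thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreePhaseSketch {

    // hypothetical stand-in for slow work such as reading and analyzing a page
    static int slowWork(String input) {
        return input.length();
    }

    static String run(final String input) {
        StringBuilder log = new StringBuilder();
        log.append("pre;"); // phase 1: plays the role of onPreExecute
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> task = () -> slowWork(input);
            Future<Integer> future = executor.submit(task); // phase 2: runs on another thread
            int result = future.get(); // this sketch simply waits for the result
            log.append("post:").append(result); // phase 3: plays the role of onPostExecute
        } catch (Exception e) {
            throw new IllegalStateException("background work failed", e);
        } finally {
            executor.shutdown();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run("abc")); // prints pre;post:3
    }
}
```

One important difference: this sketch blocks while it waits for the result, which is fine in a console program but would freeze the screen in an app. AsyncTask instead runs onPostExecute on the user interface thread after doInBackground has finished, so the interface stays responsive in between.
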
The ListView needs an adapter as its data source, and here an ArrayAdapter is used. By default, an ArrayAdapter shows each item in a single TextView. In this app, each row of the ListView needs three TextView components. The first step in customizing the ListView is to define the three TextView components in the layout file, as I mentioned above. The second step is to customize the ArrayAdapter so that the ListView can show the keyword density table, which consists of keywords, frequencies, and percentages. Generally, to customize the ArrayAdapter class, you need to extend it. The ListAdapterModel class extends the ArrayAdapter class. The important part of the ArrayAdapter class that must be overridden is the getView method. In this method, code is written to show the keyword in the keyword TextView, the frequency in the count TextView, and the percentage in the percent TextView. Below is the content of the ListAdapterModel class.

ListAdapterModel.java file

package com.example.keyworddensitychecker;

import java.util.ArrayList;
import android.content.Context;
import android.view.LayoutInflater;
import android.view.View;
import android.view.ViewGroup;
import android.widget.ArrayAdapter;
import android.widget.TextView;

public class ListAdapterModel extends ArrayAdapter<String>{
int groupid;
ArrayList<String> words;
Context context;
ArrayList<String> counts;
ArrayList<String> percents;

public ListAdapterModel(Context context, int vg,ArrayList<String> words,ArrayList<String> counts,ArrayList<String> percents){
super(context,vg, words);
this.context=context;
groupid=vg;
this.words=words;
this.counts=counts;
this.percents=percents;
}
public View getView(int position, View convertView, ViewGroup parent) {

        LayoutInflater inflater = (LayoutInflater) context.getSystemService(Context.LAYOUT_INFLATER_SERVICE);
        View itemView = inflater.inflate(groupid, parent, false);
        TextView textKeyword = (TextView) itemView.findViewById(R.id.keyword);
        TextView textCount= (TextView) itemView.findViewById(R.id.count);
        TextView textPercent= (TextView) itemView.findViewById(R.id.percent);
        textKeyword.setText(words.get(position));
        textCount.setText(counts.get(position));
        textPercent.setText(percents.get(position));
        return itemView;
     
}

}


The last code fragment, which defines the background process, is the Checker class. This class has several methods. The first method, readKeywords, is called to read the content of the web page and split it into words. Words are separated by one or more spaces, and the split method is used to split the content into words. A word might contain symbols, so a regular expression is used to remove them. The Pattern class defines the string pattern ("\\W+") that matches all symbols (except the underscore), and the replaceAll method of the Matcher class is called to remove the symbols from the words. To learn more about regular expressions in Java, read this page. The words are then filtered again to ignore the ones that are not keywords. The filtering task is performed by the addToList method, which adds every keyword to the LinkedList, keywordsList.

The countKeywords method is invoked to count the frequency of each keyword. This method also adds every keyword and its frequency to the TreeMap, denMap. The prepareTable method prepares the data needed to construct the keyword density table. The keywords are stored in the ArrayList, keywords; the frequencies are stored in the ArrayList, counts; and the percentages are stored in the ArrayList, percents. The processChecking method wraps the readKeywords, countKeywords, and prepareTable methods so that they can be called at once. The getWords, getCounts, and getPercents methods are called from the MainActivity class to return the list of keywords, the list of frequencies, and the list of percentages to be used in the ListAdapterModel class. The content of the Checker class is shown below.
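
Before looking at the whole class, the splitting and symbol-removal steps can be tried on their own. The sketch below is a stand-alone illustration, not part of the app: the class name WordCleaner and the sample sentence are made up. Unlike the app, it drops empty results right away instead of filtering them out later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class WordCleaner {

    // matches runs of characters that are not letters, digits, or underscores
    private static final Pattern SYMBOLS = Pattern.compile("\\W+");

    static List<String> clean(String line) {
        List<String> result = new ArrayList<String>();
        // split on one or more spaces
        for (String word : line.split("[ ]+")) {
            // remove the symbols, then normalize to lower case
            String cleaned = SYMBOLS.matcher(word).replaceAll("").toLowerCase();
            if (!cleaned.isEmpty()) {
                result.add(cleaned);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [hello, world, key_word, seo]
        System.out.println(clean("Hello,  world! key_word (SEO)"));
    }
}
```

Note that the underscore in key_word survives, because "\\W" treats it as a word character.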

Checker.java file

package com.example.keyworddensitychecker;
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;


class Checker{
private String address;
private TreeMap<String, Integer> denMap;
private LinkedList<String> keywordsList;
private ArrayList<String> keywords;
private ArrayList<String> counts;
private ArrayList<String> percents;

Checker(){
denMap=new TreeMap<String,Integer>();
keywordsList=new LinkedList<String>();
keywords=new ArrayList<String>();
counts=new ArrayList<String>();
percents=new ArrayList<String>();
}

public void readKeywords(){
Pattern pattern=Pattern.compile("\\W+");
try {
//read the web page content
URL url=new URL(address);
HttpURLConnection con=(HttpURLConnection)url.openConnection();
InputStream is=con.getInputStream();
BufferedReader br = new BufferedReader(new InputStreamReader(is, "UTF-8"));
String strLine;
while((strLine=br.readLine())!=null){
String[] words=strLine.split("[ ]+"); //split content in to words
for(String word:words){
Matcher mat=pattern.matcher(word); //remove all symbols except underscore
word=mat.replaceAll("");
addToList(word.toLowerCase());
}
}

br.close();
} catch (Exception ex) {
// TODO Auto-generated catch block
ex.printStackTrace();
}
}

public void addToList(String word){

//filter non-keywords
//there are more non-keywords that are not filtered in this simple program
//you can add those non-keywords in the nonKeyword array below.
boolean isNonKey=false;
String[] nonKeywords={"a","an","after","and","are","as","at","above","before"
,"below","beyond","br","div","for","in","is","li","of","on","p","the"
,"span","that","this","those","tr","td","to","ul","under","when","where","you"};
for(String nonKey:nonKeywords){
if(nonKey.equals(word)){
isNonKey=true;
break;
}
}
//add only the keyword to the list
if(!isNonKey)
keywordsList.add(word);
}


public void countKeywords(){
int count=1;
String word="";
for(int i=0;i<keywordsList.size();i++){
word=keywordsList.get(i);
for(int j=i+1;j<keywordsList.size();j++){
if(word.equals(keywordsList.get(j))){
count++;
}
}

addToMap(word,count);
count=1;
}

}

public void addToMap(String word, int count){
//place keyword and its frequency in TreeMap
if(!denMap.containsKey(word) && word.length()>=1){
denMap.put(word, count);
}

}


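public void prepareTable(){

//the total number of keywords analyzed is the sum of all frequencies
int total=0;
for(int c:denMap.values()){
total+=c;
}
Set<String> keys=denMap.keySet();
Iterator<String> iterator=keys.iterator();
while(iterator.hasNext()){
String word=iterator.next();
int count=denMap.get(word);
//percentage = 100 * frequency / total keywords (integer division)
int p=100*count/total;
//keywords that round down to 0% are left out of the table
if(p>0){
keywords.add(word);
counts.add(String.valueOf(count));
percents.add(p+"%");

}
}

}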

public void processChecking(String address){

this.address=address;
readKeywords();
countKeywords();
prepareTable();
}

public ArrayList<String> getWords(){
return keywords;
}
public ArrayList<String> getCounts(){
return counts;
}
public ArrayList<String> getPercents(){
return percents;
}


}
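
The countKeywords method in the listing compares every keyword with all the keywords that come after it, which gets slow on long pages. As an alternative sketch (not the code used in the app), the frequencies can be collected in a single pass over the list. The class name SinglePassCounter and the sample words are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SinglePassCounter {

    static Map<String, Integer> count(List<String> keywords) {
        // a TreeMap keeps the keywords sorted alphabetically, like denMap in the app
        Map<String, Integer> frequencies = new TreeMap<String, Integer>();
        for (String keyword : keywords) {
            Integer seen = frequencies.get(keyword);
            // first occurrence starts at 1, later occurrences add 1
            frequencies.put(keyword, seen == null ? 1 : seen + 1);
        }
        return frequencies;
    }

    public static void main(String[] args) {
        // prints {android=1, density=1, seo=3}
        System.out.println(count(Arrays.asList("seo", "android", "seo", "density", "seo")));
    }
}
```

Each keyword is looked at only once, so the work grows roughly in step with the number of words on the page.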


Before running the KeywordDensityChecker app, you need to allow it to access the internet by adding the line below to the AndroidManifest.xml file.

<uses-permission android:name="android.permission.INTERNET"/>

Download the apk file of the KeywordDensityChecker app
