    qileilove

    This blog has moved to GitHub; please visit http://qaseven.github.io/

    Using a Proactive Web Page Probing Tool

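      Our project at work is built on iBATIS, and the SQL behind every query contains many conditional branches.
      After the last round of SQL tuning, one of those branches started throwing errors, but as a DBA I cannot check every branch by hand.
      So I simply wrote a crawler that walks all the pages and actively probes the application for failures.
      This has two benefits:
      1. It proactively checks whether pages are broken (500 errors, 404 errors).
      2. It flags slow pages, which can also help locate slow SQL. (Network timeouts can also be caused by insufficient server resources.)
      Prerequisite:
      This only suits an Internet company whose pages can mostly be viewed without logging in.
      First, create the table: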
      CREATE SEQUENCE seq_probe_id INCREMENT BY 1 START WITH 1 NOMAXvalue NOCYCLE CACHE 2000;
      create table probe(
      id int primary key,
      host varchar(40) not null,
      path varchar(500) not null,
      state int not null,
      taskTime int not null,
      type varchar(10) not null,
      createtime date default sysdate not null
      ) ;
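      Here host is the domain name, path is the page's path relative to the host, state is the HTTP status code, taskTime is the time taken to fetch the page in milliseconds, and type is the file type (html, htm, jpg, and so on).
      Program structure
      The program has three main steps, connected by three queues in a producer-consumer pattern.
      1. Connect. Take a target from the connect queue, fetch the page over a Socket, and put it on the parse queue.
      2. Parse. Take a page from the parse queue, extract its valid links with a regular expression, and put them back on the connect queue. Then put the parsed page on the persistence queue.
      3. Persist. Write the contents of the persistence queue to the database so they can be queried.
      The three steps run in parallel, and each step can itself run concurrently.
      In practice, though, parsing and persistence can each run on a single thread.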
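The three-queue structure described above can be sketched in isolation. This is a minimal illustration, not the program itself: the class name PipelineSketch, the STOP marker, and the string tags are all hypothetical, the fetch and parse work is replaced by string tagging, and the feedback of newly found links from the parse step into the connect queue is left out so that the example terminates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

// Toy version of the connect -> parse -> persist pipeline; all names are hypothetical.
final class PipelineSketch {
    // Hypothetical end-of-input marker; the original program has no shutdown signal.
    private static final String STOP = "<stop>";

    // Moves items from one queue to the next, tagging each, until STOP arrives.
    private static void relay(BlockingQueue<String> in, BlockingQueue<String> out, String tag) {
        try {
            while (true) {
                String item = in.take();
                if (item.equals(STOP)) {
                    out.put(STOP); // pass the marker downstream, then finish
                    return;
                }
                out.put(item + tag);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static List<String> run(List<String> paths) throws Exception {
        BlockingQueue<String> connectQueue = new LinkedBlockingQueue<>(paths);
        BlockingQueue<String> parseQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> persistQueue = new LinkedBlockingQueue<>();
        connectQueue.put(STOP);

        // One thread per stage, so items keep their order through the pipeline.
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            pool.submit(() -> relay(connectQueue, parseQueue, " [fetched]"));
            pool.submit(() -> relay(parseQueue, persistQueue, " [parsed]"));
            // The last stage stands in for the database: it collects what it receives.
            Future<List<String>> sink = pool.submit(() -> {
                List<String> stored = new ArrayList<>();
                for (String item = persistQueue.take(); !item.equals(STOP); item = persistQueue.take()) {
                    stored.add(item);
                }
                return stored;
            });
            return sink.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(Arrays.asList("/a.htm", "/b.htm")));
    }
}
```

Each stage only ever blocks on its own input queue, which is why the stages can be scaled independently: a slow stage gets more worker threads, while a cheap one stays on a single thread.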
    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.net.InetAddress;
    import java.net.Socket;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Set;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentSkipListSet;
    import java.util.concurrent.CopyOnWriteArrayList;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    public class Probe {
    private static final BlockingQueue<Task> CONNECTLIST = new LinkedBlockingQueue<Task>();
    private static final BlockingQueue<Task> PARSELIST = new LinkedBlockingQueue<Task>();
    private static final BlockingQueue<Task> PERSISTENCELIST = new LinkedBlockingQueue<Task>();
    private static ExecutorService CONNECTTHREADPOOL;
    private static ExecutorService PARSETHREADPOOL;
    private static ExecutorService PERSISTENCETHREADPOOL;
    private static final List<String> DOMAINLIST = new CopyOnWriteArrayList<>();
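    static {
    // 200 connection workers; parsing and persistence each run on a single thread.
    CONNECTTHREADPOOL = Executors.newFixedThreadPool(200);
    PARSETHREADPOOL = Executors.newSingleThreadExecutor();
    PERSISTENCETHREADPOOL = Executors.newFixedThreadPool(1);
    // Whitelist of crawlable hosts; "your-domain" is a placeholder for the real domain name.
    DOMAINLIST.add("your-domain");
    }
    public static void main(String args[]) throws Exception {
    long start = System.currentTimeMillis();
    // Seed the crawl with a start page on the placeholder host.
    CONNECTLIST.put(new Task("your-domain", 80, "/static/index.html"));
    // Every ConnectHandler loops forever, so submit exactly one per pool thread;
    // any handlers beyond the pool size would never leave the executor's queue.
    for (int i = 0; i < 200; i++) {
    CONNECTTHREADPOOL.submit(new ConnectHandler(CONNECTLIST, PARSELIST));
    }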
    PARSETHREADPOOL.submit(new ParseHandler(CONNECTLIST, PARSELIST, PERSISTENCELIST, DOMAINLIST));
    PERSISTENCETHREADPOOL.submit(new PersistenceHandler(PERSISTENCELIST));
    while (true) {
    Thread.sleep(1000);
    long end = System.currentTimeMillis();
    // Elapsed seconds as a float; dividing by 1000f avoids truncating to whole seconds.
    float interval = (end - start) / 1000f;
    int connectTotal = ConnectHandler.GETCOUNT();
    int parseTotal = ParseHandler.GETCOUNT();
    int persistenceTotal = PersistenceHandler.GETCOUNT();
    int connectps = Math.round(connectTotal / interval);
    int parseps = Math.round(parseTotal / interval);
    int persistenceps = Math.round(persistenceTotal / interval);
    // Rewrite a single status line in place: totals, per-second rates, and queue backlogs.
    System.out.print("\rconnections: " + connectTotal + " \tconnections/s: " + connectps + "\tconnect queue: " + CONNECTLIST.size()
    + " \tparsed: " + parseTotal + " \tparsed/s: " + parseps + "\tparse queue: " + PARSELIST.size() + " \tpersisted: "
    + persistenceTotal + " \tpersisted/s: " + persistenceps + "\tpersistence queue: " + PERSISTENCELIST.size());
    }
    }
    }
    class Task {
    public Task() {
    }
    public void init(String host, int port, String path) {
    this.setCurrentPath(path);
    this.host = host;
    this.port = port;
    }
    public Task(String host, int port, String path) {
    init(host, port, path);
    }
    private String host;
    private int port;
    private String currentPath;
    private long taskTime;
    private String type;
    private String content;
    private int state;
    public int getState() {
    return state;
    }
    public void setState(int state) {
    this.state = state;
    }
    public String getCurrentPath() {
    return currentPath;
    }
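    public void setCurrentPath(String currentPath) {
    this.currentPath = currentPath;
    // Take the type from the extension of the last path segment, ignoring any query string.
    int query = currentPath.indexOf('?');
    String pathOnly = query != -1 ? currentPath.substring(0, query) : currentPath;
    int dot = pathOnly.lastIndexOf('.');
    // "unknown" marks paths without an extension so the NOT NULL type column still gets a value.
    this.type = dot > pathOnly.lastIndexOf('/') ? pathOnly.substring(dot + 1) : "unknown";
    }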
    public long getTaskTime() {
    return taskTime;
    }
    public void setTaskTime(long taskTime) {
    this.taskTime = taskTime;
    }
    public String getType() {
    return type;
    }
    public void setType(String type) {
    this.type = type;
    }
    public String getHost() {
    return host;
    }
    public int getPort() {
    return port;
    }
    public String getContent() {
    return content;
    }
    public void setContent(String content) {
    this.content = content;
    }
    }
    class ParseHandler implements Runnable {
    private static Set<String> SET = new ConcurrentSkipListSet<String>();
    public static int GETCOUNT() {
    return COUNT.get();
    }
    private static final AtomicInteger COUNT = new AtomicInteger();
    private BlockingQueue<Task> connectlist;
    private BlockingQueue<Task> parselist;
    private BlockingQueue<Task> persistencelist;
    List<String> domainlist;
    private interface Filter {
    void doFilter(Task fatherTask, Task newTask, String path, Filter chain);
    }
    private class FilterChain implements Filter {
    private List<Filter> list = new ArrayList<Filter>();
    {
    addFilter(new TwoLevel());
    addFilter(new OneLevel());
    addFilter(new FullPath());
    addFilter(new Root());
    addFilter(new Default());
    }
    private void addFilter(Filter filter) {
    list.add(filter);
    }
    private Iterator<Filter> it = list.iterator();
    @Override
    public void doFilter(Task fatherTask, Task newTask, String path, Filter chain) {
    if (it.hasNext()) {
    it.next().doFilter(fatherTask, newTask, path, chain);
    }
    }
    }
    private class TwoLevel implements Filter {
    @Override
    public void doFilter(Task fatherTask, Task newTask, String path, Filter chain) {
    if (path.startsWith("../../")) {
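    String prefix = getPrefix(fatherTask.getCurrentPath(), 3);
    // getPrefix returns "/" or a directory without a trailing slash, so add the separator when it is missing.
    String base = prefix.endsWith("/") ? prefix : prefix + "/";
    newTask.init(fatherTask.getHost(), fatherTask.getPort(), base + path.substring("../../".length()));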
    } else {
    chain.doFilter(fatherTask, newTask, path, chain);
    }
    }
    }
    private class OneLevel implements Filter {
    @Override
    public void doFilter(Task fatherTask, Task newTask, String path, Filter chain) {
    if (path.startsWith("../")) {
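    String prefix = getPrefix(fatherTask.getCurrentPath(), 2);
    // getPrefix returns "/" or a directory without a trailing slash, so add the separator when it is missing.
    String base = prefix.endsWith("/") ? prefix : prefix + "/";
    newTask.init(fatherTask.getHost(), fatherTask.getPort(), base + path.substring("../".length()));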
    } else {
    chain.doFilter(fatherTask, newTask, path, chain);
    }
    }
    }
    private class FullPath implements Filter {
    @Override
    public void doFilter(Task fatherTask, Task newTask, String path, Filter chain) {
    if (path.startsWith("http://")) {
    Iterator<String> it = domainlist.iterator();
    boolean flag = false;
    while (it.hasNext()) {
    String domain = it.next();
    if (path.startsWith("http://" + domain + "/")) {
    newTask.init(domain, fatherTask.getPort(), path.replace("http://" + domain + "/", "/"));
    flag = true;
    break;
    }
    }
    if (!flag) {
    newTask = null;
    }
    } else {
    chain.doFilter(fatherTask, newTask, path, chain);
    }
    }
    }
    private class Root implements Filter {
    @Override
    public void doFilter(Task fatherTask, Task newTask, String path, Filter chain) {
    if (path.startsWith("/")) {
    newTask.init(fatherTask.getHost(), fatherTask.getPort(), path);
    } else {
    chain.doFilter(fatherTask, newTask, path, chain);
    }
    }
    }
    private class Default implements Filter {
    @Override
    public void doFilter(Task fatherTask, Task newTask, String path, Filter chain) {
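    String prefix = getPrefix(fatherTask.getCurrentPath(), 1);
    // Avoid a double slash when the parent page sits in the root directory.
    String base = prefix.endsWith("/") ? prefix : prefix + "/";
    newTask.init(fatherTask.getHost(), fatherTask.getPort(), base + path);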
    }
    }
    public ParseHandler(BlockingQueue<Task> connectlist, BlockingQueue<Task> parselist,
    BlockingQueue<Task> persistencelist, List<String> domainlist) {
    this.connectlist = connectlist;
    this.parselist = parselist;
    this.persistencelist = persistencelist;
    this.domainlist = domainlist;
    }
    private Pattern pattern = Pattern.compile("\"[^\"]+\\.htm[^\"]*\"");
    private void handler() {
    try {
    Task task = parselist.take();
    parseTaskState(task);
    if (200 == task.getState()) {
    Matcher matcher = pattern.matcher(task.getContent());
    while (matcher.find()) {
    String path = matcher.group();
    if (!path.contains(" ") && !path.contains("\t") && !path.contains("(") && !path.contains(")")
    && !path.contains(":")) {
    path = path.substring(1, path.length() - 1);
    if (!SET.contains(path)) {
    SET.add(path);
    createNewTask(task, path);
    }
    }
    }
    }
    task.setContent(null);
    persistencelist.put(task);
    } catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    }
    }
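    private void parseTaskState(Task task) {
    // The request is sent as HTTP/1.0, so accept both "HTTP/1.0 NNN" and "HTTP/1.1 NNN" status lines.
    if (task.getContent().startsWith("HTTP/1.")) {
    task.setState(Integer.parseInt(task.getContent().substring(9, 12)));
    } else {
    // Site-specific fallback: read the code at a fixed offset; adjust for the server being probed.
    task.setState(Integer.parseInt(task.getContent().substring(19, 22)));
    }
    }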
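    /**
    * Resolves a link found on fatherTask's page and queues it for fetching.
    * @param fatherTask the page on which the link was found
    * @param path the raw link text
    * @throws Exception if queueing is interrupted
    */
    private void createNewTask(Task fatherTask, String path) throws Exception {
    Task newTask = new Task();
    FilterChain filterchain = new FilterChain();
    filterchain.doFilter(fatherTask, newTask, path, filterchain);
    // A filter can only null its own parameter, never this reference, so a link
    // rejected by the whitelist arrives here as an uninitialized Task with a null host.
    if (newTask.getHost() != null) {
    connectlist.put(newTask);
    }
    }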
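    private String getPrefix(String s, int count) {
    // Strip `count` trailing path segments, stopping at the root instead of
    // failing when a link climbs above it.
    String prefix = s;
    while (count > 0 && prefix.lastIndexOf("/") >= 0) {
    prefix = prefix.substring(0, prefix.lastIndexOf("/"));
    count--;
    }
    return "".equals(prefix) ? "/" : prefix;
    }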
    @Override
    public void run() {
    while (true) {
    this.handler();
    COUNT.addAndGet(1);
    }
    }
    }
    class ConnectHandler implements Runnable {
    public static int GETCOUNT() {
    return COUNT.get();
    }
    private static final AtomicInteger COUNT = new AtomicInteger();
    private BlockingQueue<Task> connectlist;
    private BlockingQueue<Task> parselist;
    public ConnectHandler(BlockingQueue<Task> connectlist, BlockingQueue<Task> parselist) {
    this.connectlist = connectlist;
    this.parselist = parselist;
    }
    private void handler() {
    try {
    Task task = connectlist.take();
    long start = System.currentTimeMillis();
    getHtml(task);
    long end = System.currentTimeMillis();
    task.setTaskTime(end - start);
    parselist.put(task);
    } catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    }
    }
    private void getHtml(Task task) throws Exception {
    StringBuilder sb = new StringBuilder(2048);
    InetAddress addr = InetAddress.getByName(task.getHost());
    // Open the socket and its streams in try-with-resources so they are closed even if the request fails.
    try (Socket socket = new Socket(addr, task.getPort());
    BufferedWriter wr = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream(), "UTF-8"));
    BufferedReader rd = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
    // Send a minimal HTTP/1.0 request: the request line, a few headers, then a blank line.
    // See the HTTP protocol for details.
    wr.write("GET " + task.getCurrentPath() + " HTTP/1.0\r\n");
    wr.write("HOST:" + task.getHost() + "\r\n");
    wr.write("Accept:*/*\r\n");
    wr.write("\r\n");
    wr.flush();
    // Read the whole response (headers and body); line breaks are dropped.
    String line;
    while ((line = rd.readLine()) != null) {
    sb.append(line);
    }
    }
    task.setContent(sb.toString());
    }
    @Override
    public void run() {
    while (true) {
    this.handler();
    COUNT.addAndGet(1);
    }
    }
    }
    class PersistenceHandler implements Runnable {
    static {
    try {
    Class.forName("oracle.jdbc.OracleDriver");
    } catch (ClassNotFoundException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    }
    }
    public static int GETCOUNT() {
    return COUNT.get();
    }
    private static final AtomicInteger COUNT = new AtomicInteger();
    private BlockingQueue<Task> persistencelist;
    public PersistenceHandler(BlockingQueue<Task> persistencelist) {
    this.persistencelist = persistencelist;
    try {
    // Oracle thin-driver URLs take the form jdbc:oracle:thin:@host:port:SID.
    conn = DriverManager.getConnection("jdbc:oracle:thin:@127.0.0.1:1521:orcl", "edmond", "edmond");
    // handler() commits explicitly after each insert, so turn auto-commit off.
    conn.setAutoCommit(false);
    ps = conn
    .prepareStatement("insert into probe(id,host,path,state,tasktime,type) values(seq_probe_id.nextval,?,?,?,?,?)");
    } catch (SQLException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    }
    }
    private Connection conn;
    private PreparedStatement ps;
    @Override
    public void run() {
    while (true) {
    this.handler();
    COUNT.addAndGet(1);
    }
    }
    private void handler() {
    try {
    Task task = persistencelist.take();
    ps.setString(1, task.getHost());
    ps.setString(2, task.getCurrentPath());
    ps.setInt(3, task.getState());
    ps.setLong(4, task.getTaskTime());
    ps.setString(5, task.getType());
    ps.executeUpdate();
    conn.commit();
    } catch (InterruptedException e) {
    e.printStackTrace();
    } catch (SQLException e) {
    e.printStackTrace();
    }
    }
    }
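      ParseHandler uses the chain-of-responsibility pattern:
      TwoLevel handles links that start with ../../ (../../sucai/sucai.htm)
      OneLevel handles links that start with ../ (../sucai/sucai.htm)
      FullPath handles absolute URLs (http://<domain>/sucai/sucai.htm)
      Root handles links that start with / (/sucai/sucai.htm)
      Default handles plain relative links (sucai.htm)
      The FullPath filter in ParseHandler needs a whitelist,
      which keeps the crawler confined to a fixed set of domains.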
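For comparison, the five link shapes listed above can also be resolved with the standard library instead of a hand-written filter chain. The sketch below swaps in java.net.URI.resolve, which is not what the program does; the class name LinkResolver, the host example.com, and the parent page path are assumptions made only for illustration.

```java
import java.net.URI;
import java.util.Collections;
import java.util.Set;

// Alternative to the filter chain: let java.net.URI resolve links against the parent page.
final class LinkResolver {
    // Resolves a link found on pageUrl into an absolute URL.
    static String resolve(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link).toString();
    }

    // Whitelist check in the spirit of the FullPath filter: only listed hosts are crawled.
    static boolean isWhitelisted(String absoluteUrl, Set<String> domains) {
        String host = URI.create(absoluteUrl).getHost();
        return host != null && domains.contains(host);
    }

    public static void main(String[] args) {
        String page = "http://example.com/a/b/c/page.htm"; // hypothetical parent page
        String[] links = {
            "../../sucai/sucai.htm",              // TwoLevel case
            "../sucai/sucai.htm",                 // OneLevel case
            "http://example.com/sucai/sucai.htm", // FullPath case
            "/sucai/sucai.htm",                   // Root case
            "sucai.htm"                           // Default case
        };
        Set<String> whitelist = Collections.singleton("example.com");
        for (String link : links) {
            String url = resolve(page, link);
            System.out.println(url + " allowed=" + isWhitelisted(url, whitelist));
        }
    }
}
```

The trade-off is that URI.create throws IllegalArgumentException on malformed links, so a crawler built this way would need to catch that exception for pages with sloppy markup.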
      ParseHandler.parseTaskState, which parses the status code, may need adjusting for the site being probed.
      For example, on a 404 the server may return an error page rather than the usual HTTP status code.
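One way to make the status parsing less dependent on fixed offsets is to match the status line with a regular expression. This is only a sketch under assumptions: the class name StatusLineSketch and the -1 sentinel for "no status line found" are hypothetical, and it does not cover the site-specific case where an error page is served with a 200 status.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts the status code from the first line of a raw HTTP response.
final class StatusLineSketch {
    // Matches e.g. "HTTP/1.0 404" or "HTTP/1.1 200" at the start of the response.
    private static final Pattern STATUS_LINE = Pattern.compile("^HTTP/\\d\\.\\d (\\d{3})");

    // Returns the status code, or -1 (a hypothetical sentinel) when no status line is present.
    static int parseStatus(String rawResponse) {
        if (rawResponse == null) {
            return -1;
        }
        Matcher m = STATUS_LINE.matcher(rawResponse);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        System.out.println(parseStatus("HTTP/1.1 200 OK"));
        System.out.println(parseStatus("HTTP/1.0 404 Not Found"));
        System.out.println(parseStatus("<html>error page</html>"));
    }
}
```

Returning a sentinel instead of throwing lets the caller still persist the page, so malformed responses show up in the probe table rather than disappearing.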
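      This first version only implements the basic functionality, and its error handling is incomplete,
      so it only works on the domain it was tailored for and is not really general-purpose. It will be improved step by step.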

    posted on 2014-12-03 13:43 by 順其自然EVO, filed under: Testing Study Column
