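Installing and Configuring Coreseek on Ubuntu
Compared with Windows, Ubuntu has a number of strengths in features and performance. It is also a completely free operating system, so more and more users are likely to adopt it. This article walks through installing and configuring Coreseek on Ubuntu.
I installed everything according to the official documentation, but the final configuration step kept failing. In the end I had to search Google again, and only the configuration below worked. I am new to coreseek and not yet familiar with some of its parameters; I wanted to get it configured and try it out without having read the official documentation properly.
To avoid compile errors, install the following packages first. The command below uses yum, the CentOS/RHEL package manager; on Ubuntu, install the equivalent packages with apt-get (some package names differ):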
- yum -y install mysql mysql-devel php-mysql qt4-mysql python python-dev gcc-c++ gtk+ libtool automake autoconf glibc-common expat-devel
1. Installation
wget http://www.coreseek.cn/uploads/csft/3.1/Source/csft-3.1.tar.gz #### coreseek source
wget http://www.coreseek.cn/uploads/csft/3.1/Source/mmseg-3.1.tar.gz ##### mmseg, the word segmenter and dictionary used by coreseek
tar zxvf csft-3.1.tar.gz
tar zxvf mmseg-3.1.tar.gz
##### mmseg must be installed before coreseek
- cd mmseg-3.1
- ./configure --prefix=/usr/local/mmseg
- make
- make install
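######## Install coreseek ########
## The Python data source is not used here; add --with-python if you need it. The mmseg paths must match the location where mmseg was installed.
- cd ../csft-3.1
- ./configure --prefix=/usr/local/coreseek --with-mmseg-includes=/usr/local/mmseg/include/mmseg --with-mmseg-libs=/usr/local/mmseg/lib --without-iconv
Specifying the --enable-id64 option turns on support for 64-bit document IDs and word IDs.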
- make
- make install
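If there are no problems, the coreseek directory and its files are created under /usr/local/ once installation finishes.
Next, generate the mmseg dictionary and configuration file:
cd /usr/local/mmseg
/usr/local/mmseg/bin/mmseg -u /usr/local/src/mmseg-3.1/data/unigram.txt ### unigram.txt is the dictionary source file (this assumes the sources were extracted in /usr/local/src); the command generates unigram.txt.uni
cd ../coreseek
mkdir dict ### create the dictionary directory
cp /usr/local/src/mmseg-3.1/data/unigram.txt.uni dict/uni.lib ### copy the generated dictionary into dict
vim dict/mmseg.ini #### create the mmseg configuration file; the Windows version of coreseek already ships with this file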
- mmseg.ini:
- [mmseg]
- merge_number_and_ascii=1;
- number_and_ascii_joint=-;
- compress_space=0;
- seperate_number_ascii=1;
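Instead of typing the file in an editor, the same mmseg.ini can be written from a script. The following is a minimal sketch: the function name write_mmseg_ini is hypothetical, and the default directory is the /usr/local/coreseek/dict path used in this article. The key names are copied unchanged from the listing above.

```shell
# Write the mmseg.ini shown above into a dictionary directory.
# The directory argument is optional; it defaults to the path used in this article.
write_mmseg_ini() {
    dict_dir="${1:-/usr/local/coreseek/dict}"
    mkdir -p "$dict_dir" || return 1
    # Quoted heredoc delimiter: the content is written literally, with no expansion.
    cat > "$dict_dir/mmseg.ini" <<'EOF'
[mmseg]
merge_number_and_ascii=1;
number_and_ascii_joint=-;
compress_space=0;
seperate_number_ascii=1;
EOF
}
```

Calling write_mmseg_ini with no argument writes to /usr/local/coreseek/dict/mmseg.ini; pass another directory to try it out somewhere harmless first.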
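That completes the mmseg configuration. The next step is csft.conf, the coreseek configuration file (the commands later in this article expect it at /usr/local/coreseek/dict/csft.conf):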
- source article
- {
- type = mysql
- sql_host = localhost
- sql_user = root
- sql_pass = jiaxian
- sql_db = test
- sql_port = 3306 # optional, default is 3306
- sql_query_pre = SET NAMES utf8
- #sql_query_pre = SET SESSION query_cache_type=OFF ## this turns off the SQL query cache
- #sql_query = SELECT id, classid, checked, title, newstime, newstext FROM article
- sql_query_range = SELECT MIN(id),MAX(id) FROM article
- sql_range_step = 1000
- sql_query = SELECT id, classid, checked, title, newstime, newstext FROM article WHERE id>=$start AND id<=$end
- sql_attr_uint = classid
- sql_attr_uint = checked
- sql_attr_uint = newstime
- sql_query_info = select * from article where id=$id
- }
- index article
- {
- source = article
- path = /usr/local/coreseek/var/data/article
- docinfo = extern
- charset_type = zh_cn.utf-8 ### character encoding used by coreseek
- charset_dictpath = /usr/local/coreseek/dict ##### directory holding the coreseek dictionary files
- min_prefix_len = 0
- min_infix_len = 0
- min_word_len = 2
- ngram_len = 1
- ngram_chars = U+4E00..U+9FBF, U+3400..U+4DBF, U+20000..U+2A6DF, U+F900..U+FAFF,\
- U+2F800..U+2FA1F, U+2E80..U+2EFF, U+2F00..U+2FDF, U+3100..U+312F, U+31A0..U+31BF,\
- U+3040..U+309F, U+30A0..U+30FF, U+31F0..U+31FF, U+AC00..U+D7AF, U+1100..U+11FF,\
- U+3130..U+318F, U+A000..U+A48F, U+A490..U+A4CF
- html_strip = 0
- }
- indexer
- {
- mem_limit = 256M
- }
- searchd
- {
- # address = 0.0.0.0
- log = /usr/local/coreseek/var/log/searchd.log
- query_log = /usr/local/coreseek/var/log/query.log
- read_timeout = 5
- max_children = 30
- pid_file = /usr/local/coreseek/var/log/searchd.pid
- max_matches = 1000
- seamless_rotate = 1
- }
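The sql_query_range and sql_range_step settings in the source block make indexer fetch the table in id ranges rather than in one large query: $start and $end in sql_query are replaced for each range. The sketch below only illustrates that splitting; it does not call indexer, and the function name print_ranges is hypothetical.

```shell
# Print the id ranges a ranged sql_query would be run over.
# Arguments: minimum id, maximum id, step (the sql_range_step value).
print_ranges() {
    min_id=$1
    max_id=$2
    step=$3
    start=$min_id
    while [ "$start" -le "$max_id" ]; do
        end=$((start + step - 1))
        # The last range is clipped to the maximum id.
        if [ "$end" -gt "$max_id" ]; then
            end=$max_id
        fi
        echo "id>=$start AND id<=$end"
        start=$((end + 1))
    done
}
```

For example, print_ranges 12 3456 1000 prints four ranges, from "id>=12 AND id<=1011" through "id>=3012 AND id<=3456", which is the kind of WHERE clause the sql_query above receives on each pass.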
Table structure for `article`:
- DROP TABLE IF EXISTS `article`;
- CREATE TABLE IF NOT EXISTS `article` (
- `id` int(11) NOT NULL AUTO_INCREMENT,
- `classid` smallint(6) NOT NULL DEFAULT '0',
- `checked` tinyint(1) NOT NULL DEFAULT '0',
- `title` varchar(200) NOT NULL DEFAULT '',
- `newstime` int(10) NOT NULL DEFAULT '0',
- `newstext` mediumtext NOT NULL,
- PRIMARY KEY (`id`),
- KEY `checked` (`checked`),
- KEY `newstime` (`newstime`),
- KEY `classid` (`classid`)
- ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
- INSERT INTO `article` (`id`, `classid`, `checked`, `title`, `newstime`, `newstext`) SELECT `id`, `classid`, `checked`, `title`, `newstime`, `newstext` FROM `test` WHERE id < 1000;
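Build the index:
- /usr/local/coreseek/bin/indexer --config /usr/local/coreseek/dict/csft.conf --all --rotate
Test it with the command-line search tool, using the configuration file and the article index defined above:
- /usr/local/coreseek/bin/search -c /usr/local/coreseek/dict/csft.conf -i article 鋁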
Start the Sphinx daemon (searchd); the second command stops it:
- /usr/local/coreseek/bin/searchd -c /usr/local/coreseek/dict/csft.conf
- /usr/local/coreseek/bin/searchd --stop -c /usr/local/coreseek/dict/csft.conf
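After starting or stopping the daemon, it can be useful to confirm its state. The sketch below relies on the pid_file path set in the searchd block of csft.conf above; the function name searchd_running is hypothetical. It only checks that a process with the recorded pid exists, not that the process is actually searchd.

```shell
# Return success if the pid recorded in the searchd pid file belongs to a live process.
# The pid file argument is optional; it defaults to the pid_file value from csft.conf above.
searchd_running() {
    pid_file="${1:-/usr/local/coreseek/var/log/searchd.pid}"
    [ -f "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    # kill -0 sends no signal; it only tests whether the process exists.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

A typical use would be: if searchd_running; then echo "searchd is up"; else echo "searchd is down"; fi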
2. Errors
When compiling Sphinx on CentOS, the build kept failing at link time with undefined references to libiconv, starting in xmlUnknownEncoding:
- libsphinx.a(sphinx.o): In function `xmlUnknownEncoding':
- /var/nfs_root/csft-3.1/src/sphinx.cpp:19072: undefined reference to `libiconv_open'
- /var/nfs_root/csft-3.1/src/sphinx.cpp:19090: undefined reference to `libiconv'
- /var/nfs_root/csft-3.1/src/sphinx.cpp:19096: undefined reference to `libiconv_close'
- libsphinx.a(tokenizer_zhcn.o): In function `CSphTokenizer_zh_CN_GBK::GetLocalBuffer(unsigned char*, int, unsigned char*)':
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:327: undefined reference to `libiconv'
- libsphinx.a(tokenizer_zhcn.o): In function `CSphTokenizer_zh_CN_UTF8_Private::GetConverterOutput(char const*, char const*)':
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:79: undefined reference to `libiconv_open'
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:82: undefined reference to `libiconv'
- libsphinx.a(tokenizer_zhcn.o): In function `CSphTokenizer_zh_CN_GBK::SetBuffer(unsigned char*, int)':
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:355: undefined reference to `libiconv'
- libsphinx.a(tokenizer_zhcn.o): In function `CSphTokenizer_zh_CN_UTF8_Private::GetConverter(char const*, char const*)':
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:63: undefined reference to `libiconv_open'
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:66: undefined reference to `libiconv'
- libsphinx.a(tokenizer_zhcn.o): In function `~CSphTokenizer_zh_CN_UTF8_Private':
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:36: undefined reference to `libiconv_close'
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:38: undefined reference to `libiconv_close'
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:36: undefined reference to `libiconv_close'
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:38: undefined reference to `libiconv_close'
- /var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:36: undefined reference to `libiconv_close'
- libsphinx.a(tokenizer_zhcn.o):/var/nfs_root/csft-3.1/src/tokenizer_zhcn.cpp:38: more undefined references to `libiconv_close' follow
- collect2: ld returned 1 exit status
- make[2]: *** [indexer] Error 1
- make[2]: Leaving directory `/var/nfs_root/csft-3.1/src'
- make[1]: *** [all] Error 2
- make[1]: Leaving directory `/var/nfs_root/csft-3.1/src'
- make: *** [all-recursive] Error 1
- Fix:
- Add '-liconv' to LIBS in src/Makefile
- from
- LIBS = -lm -lexpat -L/usr/local/lib
- to
- LIBS = -lm -lexpat -liconv -L/usr/local/lib
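The LIBS change above can also be applied with sed instead of editing src/Makefile by hand. This is a minimal sketch: the function name add_liconv is hypothetical, and it assumes the LIBS line starts exactly as shown in the "from" line above. It keeps a .bak copy of the original file.

```shell
# Add -liconv to the LIBS line of a Makefile, keeping a backup copy.
# Argument: path to the Makefile (for this article, src/Makefile in the csft-3.1 tree).
add_liconv() {
    makefile="$1"
    # Nothing to do if -liconv is already on the LIBS line.
    if grep -q '^LIBS = .*-liconv' "$makefile"; then
        return 0
    fi
    cp "$makefile" "$makefile.bak" || return 1
    sed 's/^LIBS = -lm -lexpat/LIBS = -lm -lexpat -liconv/' "$makefile.bak" > "$makefile"
}
```

After running add_liconv src/Makefile from the csft-3.1 directory, run make again. If configure is re-run, the Makefile is regenerated and the change has to be reapplied.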
Summary:
I hope this walkthrough of installing and configuring Coreseek on Ubuntu is helpful. There is much more to learn about Linux systems, which is left for readers to explore.