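To parse a torrent file, we first need to understand its format and then decode it accordingly.
I. BT torrent file format
This is only a brief overview; the full details are easy to find online.
BT torrent files store their data using an encoding called bencoding.
bencoding has four data types: strings, integers, lists, and dictionaries.
Strings:
A string is encoded as <string length>:<string content>. There is no delimiter marking the end of the string; the length prefix determines where it ends.
Example: "8:announce" represents "announce".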
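The length prefix counts bytes rather than characters, which matters for non-ASCII text. The following is a minimal sketch of string encoding; the class BencodeStringSketch and its method are hypothetical helpers for illustration and are not part of the parser built later in this article.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the article's parser.
final class BencodeStringSketch {
    // Encodes text as <byte length>:<bytes>. The length is the number of
    // UTF-8 bytes, which differs from the character count for non-ASCII text.
    static byte[] encode(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        byte[] prefix = (body.length + ":").getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[prefix.length + body.length];
        System.arraycopy(prefix, 0, out, 0, prefix.length);
        System.arraycopy(body, 0, out, prefix.length, body.length);
        return out;
    }
}
```

Calling BencodeStringSketch.encode("announce") yields the bytes of "8:announce", while a single accented character such as "é" gets the prefix 2 because it occupies two bytes in UTF-8.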
Integers:
An integer is encoded as i<integer value>e. It may be negative, e.g. 'i-3e'.
Example: 'i3e' represents 3.
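The format also forbids leading zeros and negative zero, so 'i03e' and 'i-0e' are invalid while 'i0e' is fine. A small sketch of those rules, assuming the whole token is already available as a String; BencodeIntegerSketch is a hypothetical name used only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the article's parser.
final class BencodeIntegerSketch {
    // Either a lone zero, or an optional minus sign followed by a number
    // that does not start with zero.
    private static final Pattern VALID = Pattern.compile("i(0|-?[1-9][0-9]*)e");

    static boolean isValid(String token) {
        return VALID.matcher(token).matches();
    }

    // Decodes a token such as "i-3e". Values outside the long range make
    // Long.parseLong throw NumberFormatException.
    static long decode(String token) {
        if (!isValid(token)) {
            throw new IllegalArgumentException("Invalid bencoded integer: " + token);
        }
        return Long.parseLong(token.substring(1, token.length() - 1));
    }
}
```

The parser later in this article enforces the same rules byte by byte while reading from a stream, and uses BigInteger so that very large values are not limited to the long range.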
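Lists:
A list is encoded as:
l<bencoded values>e
A list can hold multiple values.
Its elements may be strings, integers, dictionaries, or even other lists.
Example: 'l4:spam4:eggse' represents [ "spam", "eggs" ]
Dictionaries:
A dictionary is a one-to-one mapping. It associates a key (which must be a string) with a value (which may be any bencoded value). A dictionary can be used to describe the attributes of an object.
A dictionary is encoded as:
d<bencoded string><bencoded element>e
Example: 'd3:cow3:moo4:spam4:eggse' represents { "cow": "moo", "spam": "eggs" }
Note: the entries of a dictionary must be sorted by key.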
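To see the four types working together, here is a minimal encoder sketch. It is an illustration under stated assumptions, not the implementation used later: BencodeEncoderSketch is a hypothetical class, values are plain Java objects, and the result is returned as a String for readability even though real torrent data is binary. A TreeMap sorts the keys, which matches byte-wise ordering for ASCII keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of the article's parser.
final class BencodeEncoderSketch {
    static String encode(Object value) {
        StringBuilder out = new StringBuilder();
        write(value, out);
        return out.toString();
    }

    private static void write(Object value, StringBuilder out) {
        if (value instanceof String) {
            String s = (String) value;
            // The prefix is the UTF-8 byte length, not the character count.
            out.append(s.getBytes(StandardCharsets.UTF_8).length).append(':').append(s);
        } else if (value instanceof Integer || value instanceof Long) {
            out.append('i').append(value).append('e');
        } else if (value instanceof List) {
            out.append('l');
            for (Object item : (List<?>) value) {
                write(item, out);
            }
            out.append('e');
        } else if (value instanceof Map) {
            // Copy into a TreeMap so that the keys are written in sorted order.
            TreeMap<String, Object> sorted = new TreeMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                if (!(entry.getKey() instanceof String)) {
                    throw new IllegalArgumentException("Dictionary keys must be strings");
                }
                sorted.put((String) entry.getKey(), entry.getValue());
            }
            out.append('d');
            for (Map.Entry<String, Object> entry : sorted.entrySet()) {
                write(entry.getKey(), out);
                write(entry.getValue(), out);
            }
            out.append('e');
        } else {
            throw new IllegalArgumentException("Unsupported type: " + value);
        }
    }
}
```

Encoding a map that contains "spam" -> "eggs" and "cow" -> "moo" gives the same text regardless of insertion order, because the keys are sorted before they are written.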
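II. Parsing the BT file
1. First, create a BtValue class to hold the four bencoding data types
import java.io.UnsupportedEncodingException;
import java.util.List;
import java.util.Map;
public class BtValue {
// mValue holds a byte[] (string), a Number (integer), a List<BtValue>, or a Map<String, BtValue>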
private final Object mValue;
public BtValue(byte[] mValue) {
this.mValue = mValue;
}
public BtValue(String mValue) throws UnsupportedEncodingException {
this.mValue = mValue.getBytes("UTF-8");
}
public BtValue(String mValue, String enc) throws UnsupportedEncodingException {
this.mValue = mValue.getBytes(enc);
}
public BtValue(int mValue) {
this.mValue = mValue;
}
public BtValue(long mValue) {
this.mValue = mValue;
}
public BtValue(Number mValue) {
this.mValue = mValue;
}
public BtValue(List<BtValue> mValue) {
this.mValue = mValue;
}
public BtValue(Map<String, BtValue> mValue) {
this.mValue = mValue;
}
public Object getValue() {
return this.mValue;
}
/**
* Returns this BtValue as a String, decoded as UTF-8
*/
public String getString() throws InvalidBtEncodingException {
return getString("UTF-8");
}
public String getString(String encoding) throws InvalidBtEncodingException {
try {
return new String(getBytes(), encoding);
} catch (ClassCastException e) {
throw new InvalidBtEncodingException(e.toString());
} catch (UnsupportedEncodingException e) {
throw new InternalError(e.toString());
}
}
/**
* Returns this BtValue as a byte[] array
*/
public byte[] getBytes() throws InvalidBtEncodingException {
try {
return (byte[])mValue;
} catch (ClassCastException e) {
throw new InvalidBtEncodingException(e.toString());
}
}
/**
* Returns this BtValue as a Number
*/
public Number getNumber() throws InvalidBtEncodingException {
try {
return (Number)mValue;
} catch (ClassCastException e) {
throw new InvalidBtEncodingException(e.toString());
}
}
/**
* Returns this BtValue as a short
*/
public short getShort() throws InvalidBtEncodingException {
return getNumber().shortValue();
}
/**
* Returns this BtValue as an int
*/
public int getInt() throws InvalidBtEncodingException {
return getNumber().intValue();
}
/**
* Returns this BtValue as a long
*/
public long getLong() throws InvalidBtEncodingException {
return getNumber().longValue();
}
/**
* Returns this BtValue as a List
*/
@SuppressWarnings("unchecked")
public List<BtValue> getList() throws InvalidBtEncodingException {
// Check against the List interface so that any List implementation is accepted
if (mValue instanceof List) {
return (List<BtValue>)mValue;
} else {
throw new InvalidBtEncodingException("Expected List<BtValue> !");
}
}
/**
* Returns this BtValue as a Map
*/
@SuppressWarnings("unchecked")
public Map<String, BtValue> getMap() throws InvalidBtEncodingException {
// Check against the Map interface so that any Map implementation is accepted
if (mValue instanceof Map) {
return (Map<String, BtValue>)mValue;
} else {
throw new InvalidBtEncodingException("Expected Map<String, BtValue> !");
}
}
}
To handle errors in one place, we define a custom exception, InvalidBtEncodingException:
import java.io.IOException;
public class InvalidBtEncodingException extends IOException {
public static final long serialVersionUID = -1;
public InvalidBtEncodingException(String message) {
super(message);
}
}
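2. Write the parser class BtParser. It is a recursive-descent parser: it reads one indicator byte to decide which of the four bencoding types follows, parses that type, and stores the result as a BtValue object.
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class BtParser {
private final InputStream mInput;
// The indicator byte that has been read but not yet consumed:
// 0 unknown; nothing is buffered, so the next call reads from the stream
// '0'..'9' a byte[] value, i.e. a string
// 'i' an integer
// 'l' a list
// 'd' a dictionary
// 'e' the end marker of an integer, list, or dictionary
// -1 the end of the stream has been reached
// Call getNextIndicator() to obtain the current value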
private int mIndicator = 0;
private BtParser(InputStream in) {
mInput = in;
}
public static BtValue btDecode(InputStream in) throws IOException {
return new BtParser(in).btParse();
}
private BtValue btParse() throws IOException{
if (getNextIndicator() == -1)
return null;
if (mIndicator >= '0' && mIndicator <= '9')
return btParseBytes(); //read string
else if (mIndicator == 'i')
return btParseNumber(); // read integer
else if (mIndicator == 'l')
return btParseList(); // read list
else if (mIndicator == 'd')
return btParseMap(); // read Map
else
throw new InvalidBtEncodingException
("Unknown indicator '" + (char)mIndicator + "'");
}
/**
* Parses a bencoded string
* 1. Parse the string length
* 2. Read that many bytes from the input stream
* 3. Build a BtValue object from the bytes read
* Corresponds to the string format 4:ptgh
*/
private BtValue btParseBytes() throws IOException{
int b = getNextIndicator();
int num = b - '0';
if (num < 0 || num > 9) {
throw new InvalidBtEncodingException("parse bytes(String) error: not '"
+ (char)b +"'");
}
mIndicator = 0;
b = read();
int i = b - '0';
while (i >= 0 && i <= 9) {
num = num * 10 + i;
b = read();
i = b - '0';
}
if (b != ':') {
throw new InvalidBtEncodingException("Colon error: not '" +
(char)b + "'");
}
return new BtValue(read(num));
}
/**
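* Parses a bencoded integer
* 1. Check that it starts with the character 'i'
* 2. Check whether the number is negative
* 3. Read the digits into the chars array until the character 'e' is reached
* 4. Build the number from the chars array and wrap it in a BtValue object
* Corresponds to the integer format i5242e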
*/
private BtValue btParseNumber() throws IOException{
int b = getNextIndicator();
if (b != 'i') {
throw new InvalidBtEncodingException("parse number error: not '" +
(char)b + "'");
}
mIndicator = 0;
b = read();
if (b == '0') {
b = read();
if (b == 'e') {
return new BtValue(BigInteger.ZERO);
} else {
throw new InvalidBtEncodingException("'e' expected after zero," +
" not '" + (char)b + "'");
}
}
// don't support more than 255 char big integers
char[] chars = new char[255];
int offset = 0;
// to determine whether the number is negative
if (b == '-') {
b = read();
if (b == '0') {
throw new InvalidBtEncodingException("Negative zero not allowed");
}
chars[offset] = '-';
offset++;
}
if (b < '1' || b > '9') {
throw new InvalidBtEncodingException("Invalid Integer start '"
+ (char)b + "'");
}
chars[offset] = (char)b;
offset++;
// start read the number, save in chars
b = read();
int i = b - '0';
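while (i >= 0 && i <= 9) {
// Reject integers longer than the buffer instead of overflowing it
if (offset >= chars.length) {
throw new InvalidBtEncodingException("Integer too long");
}
chars[offset] = (char)b;
offset++;
b = read();
i = b - '0';
}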
if (b != 'e') {
throw new InvalidBtEncodingException("Integer should end with 'e'");
}
String s = new String(chars, 0, offset);
return new BtValue(new BigInteger(s));
}
/**
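* Parses a bencoded list
* 1. Check that it starts with the character 'l'
* 2. Call btParse to parse each BtValue object and add it to the list, until the character 'e' is reached
* 3. Build a BtValue object from the resulting list (the BtValue now represents a list)
* Corresponds to the list format l4:spam4:tease
* For l4:spam4:tease the list holds two BtValue objects, the strings spam and teas (the final e closes the list)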
*/
private BtValue btParseList() throws IOException{
int b = getNextIndicator();
if (b != 'l') {
throw new InvalidBtEncodingException("Expected 'l', not '" +
(char)b + "'");
}
mIndicator = 0;
List<BtValue> result = new ArrayList<>();
b = getNextIndicator();
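while (b != 'e') {
// At end of stream btParse() returns null; fail here instead of looping forever
if (b == -1) {
throw new EOFException();
}
result.add(btParse());
b = getNextIndicator();
}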
mIndicator = 0;
return new BtValue(result);
}
/**
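* Parses a bencoded dictionary
* 1. Check that it starts with the character 'd'
* 2. Call btParse to parse each key and value and add them to the Map, until the character 'e' is reached
* 3. Build a BtValue object from the resulting Map (the BtValue now represents a Map)
* Corresponds to the dictionary format d<key string><value>e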
*/
private BtValue btParseMap() throws IOException{
int b = getNextIndicator();
if (b != 'd') {
throw new InvalidBtEncodingException("Expected 'd', not '" +
(char)b + "'");
}
mIndicator = 0;
Map<String, BtValue> result = new HashMap<>();
b = getNextIndicator();
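while (b != 'e') {
// At end of stream btParse() returns null; fail with a clear error instead
if (b == -1) {
throw new EOFException();
}
// Dictionary keys are always strings
String key = btParse().getString();
BtValue value = btParse();
result.put(key, value);
b = getNextIndicator();
}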
mIndicator = 0;
return new BtValue(result);
}
private int getNextIndicator() throws IOException{
if (mIndicator == 0) {
mIndicator = mInput.read();
}
return mIndicator;
}
/**
* Reads a single byte from the input stream
*/
private int read() throws IOException {
int b = mInput.read();
if (b == -1)
throw new EOFException();
return b;
}
/**
* Reads exactly the given number of bytes from the input stream
*/
private byte[] read(int length) throws IOException {
byte[] result = new byte[length];
int read = 0;
while (read < length)
{
int i = mInput.read(result, read, length - read);
if (i == -1)
throw new EOFException();
read += i;
}
return result;
}
}
The parser class is not difficult; its logic simply follows the bencoding format.
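The least obvious part of BtParser is the mIndicator field. It acts as one byte of lookahead: getNextIndicator() reads from the stream only when mIndicator is 0, and each parse method resets it to 0 once the byte has been consumed. The sketch below shows the same idea using java.io.PushbackInputStream from the standard library; LookaheadSketch and its methods are hypothetical names, and this is an alternative illustration rather than the approach the parser above takes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the article's parser.
final class LookaheadSketch {
    // Returns the next byte without consuming it, or -1 at end of stream.
    static int peek(PushbackInputStream in) throws IOException {
        int b = in.read();
        if (b != -1) {
            in.unread(b); // push the byte back so the next read sees it again
        }
        return b;
    }

    // Classifies the next bencoded value by its first byte.
    static String nextType(PushbackInputStream in) throws IOException {
        int b = peek(in);
        if (b >= '0' && b <= '9') return "string";
        if (b == 'i') return "integer";
        if (b == 'l') return "list";
        if (b == 'd') return "dictionary";
        if (b == 'e') return "end marker";
        if (b == -1) return "end of stream";
        return "invalid";
    }

    // Convenience wrapper for trying the classification on literal text.
    static String nextTypeOf(String encoded) {
        byte[] bytes = encoded.getBytes(StandardCharsets.US_ASCII);
        try (PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(bytes))) {
            String type = nextType(in);
            // The peeked byte must still be readable by whoever parses the value.
            if (bytes.length > 0 && in.read() != bytes[0]) {
                throw new IllegalStateException("lookahead consumed a byte");
            }
            return type;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, LookaheadSketch.nextTypeOf("d3:cow3:mooe") reports a dictionary while leaving the leading 'd' in the stream, which is the role mIndicator plays for btParseMap().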
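3. Finally, create a Torrent class to organize the parsed file information
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Date;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
// Log.i(TAG, ...) below is assumed to be Android's logging class
import android.util.Log;
public class Torrent {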
private final static String TAG = "Torrent";
private final Map<String, BtValue> mDecoded;
private final Map<String, BtValue> mDecoded_info;
private final HashSet<URI> mAllTrackers;
private final ArrayList<List<URI>> mTrackers;
private final Date mCreateDate;
private final String mComment;
private final String mCreatedBy;
private final String mName;
private final int mPieceLength;
private final LinkedList<TorrentFile> mFiles;
private final long mSize;
// A torrent may contain several files; TorrentFile represents each one to make them easier to manage
public static class TorrentFile {
public final File file;
public final long size;
public TorrentFile(File file, long size) {
this.file = file;
this.size = size;
}
}
public Torrent(byte[] torrent) throws IOException {
mDecoded = BtParser.btDecode(
new ByteArrayInputStream(torrent)).getMap();
mDecoded_info = mDecoded.get("info").getMap();
try {
mAllTrackers = new HashSet<>();
mTrackers = new ArrayList<>();
// Parse announce-list to obtain the tracker addresses
if (mDecoded.containsKey("announce-list")) {
List<BtValue> tiers = mDecoded.get("announce-list").getList();
for (BtValue bv : tiers) {
List<BtValue> trackers = bv.getList();
if (trackers.isEmpty()) {
continue;
}
List<URI> tier = new ArrayList<>();
for (BtValue tracker : trackers) {
URI uri = new URI(tracker.getString());
if (!mAllTrackers.contains(uri)) {
tier.add(uri);
mAllTrackers.add(uri);
}
}
if (!tier.isEmpty()) {
mTrackers.add(tier);
}
}
} else if (mDecoded.containsKey("announce")) { // a single tracker address
URI tracker = new URI(mDecoded.get("announce").getString());
mAllTrackers.add(tracker);
List<URI> tier = new ArrayList<>();
tier.add(tracker);
mTrackers.add(tier);
}
} catch (URISyntaxException e) {
throw new IOException(e);
}
// Get the creation date (stored as seconds since the Unix epoch)
mCreateDate = mDecoded.containsKey("creation date") ?
new Date(mDecoded.get("creation date").getLong() * 1000)
: null;
// Get the comment
mComment = mDecoded.containsKey("comment")
? mDecoded.get("comment").getString()
: null;
// Get the creator of the torrent
mCreatedBy = mDecoded.containsKey("created by")
? mDecoded.get("created by").getString()
: null;
// Get the name
mName = mDecoded_info.get("name").getString();
mPieceLength = mDecoded_info.get("piece length").getInt();
mFiles = new LinkedList<>();
// Parse the multi-file info structure
if (mDecoded_info.containsKey("files")) {
for (BtValue file : mDecoded_info.get("files").getList()) {
Map<String, BtValue> fileInfo = file.getMap();
StringBuilder path = new StringBuilder();
for (BtValue pathElement : fileInfo.get("path").getList()) {
path.append(File.separator)
.append(pathElement.getString());
}
mFiles.add(new TorrentFile(
new File(mName, path.toString()),
fileInfo.get("length").getLong()));
}
} else {
// For a single-file torrent, the torrent name is the file name
mFiles.add(new TorrentFile(
new File(mName),
mDecoded_info.get("length").getLong()));
}
// Compute the total size of all files in the torrent
long size = 0;
for (TorrentFile file : mFiles) {
size += file.size;
}
mSize = size;
// The rest of the constructor simply logs the parsed torrent contents
String infoType = isMultiFile() ? "Multi" : "Single";
Log.i(TAG, "Torrent: file information: " + infoType);
Log.i(TAG, "Torrent: file name: " + mName);
Log.i(TAG, "Torrent: Announced at: " + (mTrackers.size() == 0 ? " Seems to be trackerless" : ""));
for (int i = 0; i < mTrackers.size(); ++i) {
List<URI> tier = mTrackers.get(i);
for (int j = 0; j < tier.size(); ++j) {
Log.i(TAG, "Torrent: " + (j == 0 ? String.format("%2d. ", i + 1) : " ")
+ tier.get(j));
}
}
if (mCreateDate != null) {
Log.i(TAG, "Torrent: createDate: " + mCreateDate);
}
if (mComment != null) {
Log.i(TAG, "Torrent: Comment: " + mComment);
}
if (mCreatedBy != null) {
Log.i(TAG, "Torrent: created by: " + mCreatedBy);
}
if (isMultiFile()) {
Log.i(TAG, "Found " + mFiles.size() + " file(s) in multi-file torrent structure.");
int i = 0;
for (TorrentFile file : mFiles) {
Log.i(TAG, "Torrent: file is " +
(String.format("%2d. path: %s size: %s", ++i, file.file.getPath(), file.size)));
}
}
// Round up: a final partial piece still counts as one piece
long pieces = (mSize + mPieceLength - 1) / mPieceLength;
Log.i(TAG, "Torrent: Pieces....: " + pieces + " piece(s) (" + mPieceLength + " byte(s)/piece)");
Log.i(TAG, "Torrent: Total size...: " + mSize);
}
/**
* Loads the given torrent file and converts it into a Torrent object
*/
public static Torrent load(File torrent) throws IOException {
byte[] data = readFileToByteArray(torrent);
return new Torrent(data);
}
public boolean isMultiFile() {
return mFiles.size() > 1;
}
/**
* Reads the whole file into a byte[] array
*/
private static byte[] readFileToByteArray(File file) throws IOException {
// try-with-resources closes the stream even if a read fails; errors propagate
// to the caller instead of being swallowed and turned into a null result
try (FileInputStream fis = new FileInputStream(file)) {
ByteArrayOutputStream bos = new ByteArrayOutputStream(1024);
byte[] b = new byte[1024];
int n;
while ((n = fis.read(b)) != -1) {
bos.write(b, 0, n);
}
return bos.toByteArray();
}
}
}
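The piece count reported for a torrent is the total size divided by the piece length, rounded up, because a final partial piece still counts as a piece. Adding 1 unconditionally to the integer quotient would over-count by one whenever the size is an exact multiple of the piece length. A minimal sketch of the arithmetic, where PieceMathSketch is a hypothetical helper and not part of the Torrent class:

```java
// Hypothetical helper for illustration only; not part of the Torrent class.
final class PieceMathSketch {
    // Ceiling division: the number of pieces needed to cover totalSize bytes.
    static long pieceCount(long totalSize, long pieceLength) {
        if (pieceLength <= 0) {
            throw new IllegalArgumentException("pieceLength must be positive");
        }
        return (totalSize + pieceLength - 1) / pieceLength;
    }

    // Size of the final piece, which is shorter than pieceLength unless
    // totalSize is an exact multiple of it.
    static long lastPieceSize(long totalSize, long pieceLength) {
        long remainder = totalSize % pieceLength;
        return (remainder == 0 && totalSize > 0) ? pieceLength : remainder;
    }
}
```

With a piece length of 256 bytes, both a 1000-byte torrent and a 1024-byte torrent need 4 pieces; the last piece holds 232 bytes in the first case and a full 256 bytes in the second.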
4. Usage
File fileTorrent = new File("file path");
try {
Torrent.load(fileTorrent);
} catch (IOException e) {
e.printStackTrace();
}
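This completes the parsing of a BT torrent file. Parsing a torrent file is simply a matter of decoding it according to the format in which it was encoded.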