Using UrlQuerySanitizer to process URLs

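There is not much material online about UrlQuerySanitizer, an API that Android provides for processing URLs. My project needed the query parameters of a URL to be sorted, which meant parsing the URL and working with its query parameters.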

My first approach used Uri:

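import android.net.Uri;
import java.util.Set;
import java.util.TreeSet;

public void parseUrl(String url) {
    Uri uri = Uri.parse(url);
    Set<String> query = uri.getQueryParameterNames();
    if (!query.isEmpty()) {
        // TreeSet iterates the parameter names in sorted order.
        TreeSet<String> treeQuery = new TreeSet<>(query);
        for (String key : treeQuery) {
            // Returns the first value for the key, already decoded.
            String value = uri.getQueryParameter(key);
        }
    }
}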
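
The loop above only reads the values in sorted key order. As a rough sketch of the remaining step, here is one way the sorted pairs might be joined back into a query string in plain Java. The class and method names (`SortedQueryDemo`, `toSortedQuery`) are made up for illustration, and the sketch assumes the values need no percent-encoding.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical helper, not part of the Android API.
class SortedQueryDemo {
    // Joins key/value pairs into a query string ordered by key.
    // Values are appended as-is; percent-encoding is left out of this sketch.
    static String toSortedQuery(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        // Copying into a TreeMap makes iteration follow the natural key order.
        for (Map.Entry<String, String> entry : new TreeMap<>(params).entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("name", "vincent");
        params.put("article", "first");
        System.out.println(toSortedQuery(params)); // article=first&name=vincent
    }
}
```

In an Android method the map would be filled inside the loop, with `key` and `value` taken from the parser.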

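This is enough to parse a URL and read each query parameter. Later, however, I found that Uri cannot cope with some special characters. For example, an unencoded # is treated as the start of the fragment, so Uri cuts off everything after it, and that did not meet our requirements. After a lot of googling I finally came across the UrlQuerySanitizer API: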

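import android.net.UrlQuerySanitizer;
import java.util.Set;
import java.util.TreeSet;

public void parseUrl(String url) {
    UrlQuerySanitizer sanitizer = new UrlQuerySanitizer();
    // Keep parameters that were not registered in advance
    // ("Paramaters" is how the Android API spells this method).
    sanitizer.setAllowUnregisteredParamaters(true);
    // Accept every character in values except NUL.
    sanitizer.setUnregisteredParameterValueSanitizer(UrlQuerySanitizer.getAllButNulLegal());
    sanitizer.parseUrl(url);
    final Set<String> query = sanitizer.getParameterSet();
    if (!query.isEmpty()) {
        TreeSet<String> treeQuery = new TreeSet<>(query);
        for (String key : treeQuery) {
            String value = sanitizer.getValue(key);
        }
    }
}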
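
The # problem that pushed me away from Uri can be reproduced off-device. The sketch below uses `java.net.URI` as a stand-in for `android.net.Uri` (an assumption, since the two are different classes that happen to split a URL the same way here), and the example.com URL and the `FragmentDemo` class are made up for illustration.

```java
import java.net.URI;

// Hypothetical demo class; java.net.URI stands in for android.net.Uri.
class FragmentDemo {
    // Query as seen by a standards-following parser: it ends at the first '#'.
    static String parsedQuery(String url) {
        return URI.create(url).getQuery();
    }

    // Everything after the first '?', with '#' treated as ordinary text.
    static String rawQuery(String url) {
        int start = url.indexOf('?');
        return start < 0 ? "" : url.substring(start + 1);
    }

    public static void main(String[] args) {
        String url = "http://example.com/list?tag=c#sharp&page=2";
        System.out.println(parsedQuery(url)); // tag=c
        System.out.println(rawQuery(url));    // tag=c#sharp&page=2
    }
}
```

The second method is only there to show what gets lost: the `page` parameter never reaches the parsed query because it sits behind the #.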

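First call setAllowUnregisteredParamaters(true) so that parameters which have not been registered explicitly are kept, then use setUnregisteredParameterValueSanitizer to choose which special characters are allowed in their values. UrlQuerySanitizer provides several default ValueSanitizers: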

/**
 * Return a value sanitizer that does not allow any special characters,
 * and also does not allow script URLs.
 * @return a value sanitizer
 */
public static final ValueSanitizer getAllIllegal() {
    return sAllIllegal;
}

/**
 * Return a value sanitizer that allows everything except Nul ('\0')
 * characters. Script URLs are allowed.
 * @return a value sanitizer
 */
public static final ValueSanitizer getAllButNulLegal() {
    return sAllButNulLegal;
}

/**
 * Return a value sanitizer that allows everything except Nul ('\0')
 * characters, space (' '), and other whitespace characters.
 * Script URLs are allowed.
 * @return a value sanitizer
 */
public static final ValueSanitizer getAllButWhitespaceLegal() {
    return sAllButWhitespaceLegal;
}

/**
 * Return a value sanitizer that allows all the characters used by
 * encoded URLs. Does not allow script URLs.
 * @return a value sanitizer
 */
public static final ValueSanitizer getUrlLegal() {
    return sURLLegal;
}

/**
 * Return a value sanitizer that allows all the characters used by
 * encoded URLs and allows spaces, which are not technically legal
 * in encoded URLs, but commonly appear anyway.
 * Does not allow script URLs.
 * @return a value sanitizer
 */
public static final ValueSanitizer getUrlAndSpaceLegal() {
    return sUrlAndSpaceLegal;
}

/**
 * Return a value sanitizer that does not allow any special characters
 * except ampersand ('&'). Does not allow script URLs.
 * @return a value sanitizer
 */
public static final ValueSanitizer getAmpLegal() {
    return sAmpLegal;
}

/**
 * Return a value sanitizer that does not allow any special characters
 * except ampersand ('&') and space (' '). Does not allow script URLs.
 * @return a value sanitizer
 */
public static final ValueSanitizer getAmpAndSpaceLegal() {
    return sAmpAndSpaceLegal;
}

/**
 * Return a value sanitizer that does not allow any special characters
 * except space (' '). Does not allow script URLs.
 * @return a value sanitizer
 */
public static final ValueSanitizer getSpaceLegal() {
    return sSpaceLegal;
}

/**
 * Return a value sanitizer that allows any special characters
 * except angle brackets ('<' and '>') and Nul ('\0').
 * Allows script URLs.
 * @return a value sanitizer
 */
public static final ValueSanitizer getAllButNulAndAngleBracketsLegal() {
    return sAllButNulAndAngleBracketsLegal;
}

Each ValueSanitizer defines which characters are filtered out; a filtered special character is replaced with _ or a space.
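
To make the replacement behaviour concrete, here is a simplified plain-Java model of it. This is not the Android implementation: the `ReplaceIllegalDemo` class, the `replaceIllegal` method and the letters-and-digits rule are all assumptions made up for illustration.

```java
import java.util.function.IntPredicate;

// Simplified, hypothetical model of a value sanitizer; not Android's code.
class ReplaceIllegalDemo {
    // Assumed rule for this sketch: only ASCII letters and digits are legal.
    static final IntPredicate ALPHANUMERIC = c ->
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');

    // Replaces every character rejected by isLegal with the replacement character.
    static String replaceIllegal(String value, IntPredicate isLegal, char replacement) {
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            out.append(isLegal.test(c) ? c : replacement);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceIllegal("a<b>&c", ALPHANUMERIC, '_')); // a_b__c
    }
}
```

The length of the value stays the same, since each rejected character is swapped for exactly one replacement character.
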
If the default ValueSanitizers do not meet your needs, you can also construct your own:

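public void parseUrl(String url) {
    .....
    // `sanitizer` is assumed to be the UrlQuerySanitizer created in the elided code above.
    UrlQuerySanitizer.ValueSanitizer valueSanitizer =
            new UrlQuerySanitizer.IllegalCharacterValueSanitizer(
                    UrlQuerySanitizer.IllegalCharacterValueSanitizer.ALL_OK);
    sanitizer.setUnregisteredParameterValueSanitizer(valueSanitizer);
    .....
}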

UrlQuerySanitizer can also look up a value by its key. For example, given the URL http://coolerfall.com?name=vincent:

public void parseUrl(String url) {
    UrlQuerySanitizer sanitizer = new UrlQuerySanitizer();
    sanitizer.setAllowUnregisteredParamaters(true);
    sanitizer.setUnregisteredParameterValueSanitizer(UrlQuerySanitizer.getAllButNulLegal());
    sanitizer.parseUrl(url);
    String name = sanitizer.getValue("name");
}

UrlQuerySanitizer can also parse a bare query string, for example name=vincent&article=first:

public void parseUrl(String query) {
    UrlQuerySanitizer sanitizer = new UrlQuerySanitizer();
    sanitizer.setAllowUnregisteredParamaters(true);
    sanitizer.setUnregisteredParameterValueSanitizer(UrlQuerySanitizer.getAllButNulLegal());
    sanitizer.parseQuery(query);
    String name = sanitizer.getValue("name");
    .....
}
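
For readers who want to see what parsing a bare query string amounts to, here is a minimal plain-Java sketch. It is not how UrlQuerySanitizer is implemented: the `BareQueryDemo` class is made up for illustration, and the sketch assumes no percent-decoding or sanitizing is needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of the Android API.
class BareQueryDemo {
    // Splits "k1=v1&k2=v2" into a map that keeps the original order.
    // No percent-decoding or sanitizing is done in this sketch.
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // skip empty segments such as the one in "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(key, value); // a repeated key overwrites the earlier value
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = parse("name=vincent&article=first");
        System.out.println(params.get("name"));    // vincent
        System.out.println(params.get("article")); // first
    }
}
```

Only the first = in each pair is treated as the separator, so a value that itself contains = is kept whole.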

That covers the basic usage of UrlQuerySanitizer, which makes parsing and processing URLs very convenient.