ID | Title | Difficulty | |
---|---|---|---|
Loading... |
535. Encode and Decode TinyURL
Problem
Note: This is a companion problem to the System Design problem: Design TinyURL.
TinyURL is a URL shortening service where you enter a URL such as https://leetcode.com/problems/design-tinyurl and it returns a short URL such as http://tinyurl.com/4e9iAk. Design a class to encode a URL and decode a tiny URL.
There is no restriction on how your encode/decode algorithm should work. You just need to ensure that a URL can be encoded to a tiny URL and the tiny URL can be decoded to the original URL.
Implement the Solution class:
- Solution() Initializes the object of the system.
- String encode(String longUrl) Returns a tiny URL for the given longUrl.
- String decode(String shortUrl) Returns the original long URL for the given shortUrl. It is guaranteed that the given shortUrl was encoded by the same object.
Example 1:
Input: url = "https://leetcode.com/problems/design-tinyurl"
Output: "https://leetcode.com/problems/design-tinyurl"
Explanation:
Solution obj = new Solution();
string tiny = obj.encode(url); // returns the encoded tiny url.
string ans = obj.decode(tiny); // returns the original url after deconding it.
Constraints:
- 1 <= url.length <= 10^4
- url is guaranteed to be a valid URL.
Code
方法 1:使用计数器
每次增加 1,放到 hashmap 中
- URL 的范围受到整数大小的限制,如果强制加入,会 overwrite
- 编码方式很容易被预测
public class Codec {
Map<Integer, String> map = new HashMap<>();
int i = 0;
public String encode(String longUrl) {
map.put(i, longUrl);
return "http://tinyurl.com/" + i++;
}
public String decode(String shortUrl) {
return map.get(Integer.parseInt(shortUrl.replace("http://tinyurl.com/", "")));
}
}
方法 2:变长编码
- URL 的数量仍然受到整数大小的限制,因为用到了 count
- 编码可预测
public class Codec {
String chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
HashMap<String, String> map = new HashMap<>();
int count = 1;
public String getString() {
int c = count;
StringBuilder sb = new StringBuilder();
while (c > 0) {
c--;
sb.append(chars.charAt(c % 62));
c /= 62;
}
return sb.toString();
}
public String encode(String longUrl) {
String key = getString();
map.put(key, longUrl);
count++;
return "http://tinyurl.com/" + key;
}
public String decode(String shortUrl) {
return map.get(shortUrl.replace("http://tinyurl.com/", ""));
}
}
方法 3:使用 hashCode
- HashCode()并不能保证产生 unique 的编码,因此有可能重复
- 预测很困难
public class Codec {
Map<Integer, String> map = new HashMap<>();
public String encode(String longUrl) {
map.put(longUrl.hashCode(), longUrl);
return "http://tinyurl.com/" + longUrl.hashCode();
}
public String decode(String shortUrl) {
return map.get(Integer.parseInt(shortUrl.replace("http://tinyurl.com/", "")));
}
}
方法 4:随机数,如果重复了就再产生一个
- 无法预测
- 数量受限
public class Codec {
Map<Integer, String> map = new HashMap<>();
Random r = new Random();
int key = r.nextInt(Integer.MAX_VALUE);
public String encode(String longUrl) {
while (map.containsKey(key)) {
key = r.nextInt(Integer.MAX_VALUE);
}
map.put(key, longUrl);
return "http://tinyurl.com/" + key;
}
public String decode(String shortUrl) {
return map.get(Integer.parseInt(shortUrl.replace("http://tinyurl.com/", "")));
}
}
方法 5:比较好的方法, 固定长度, 随机字符
能产生很多的 url: (10 + 26 + 26)^6
- 降低了编码长度,固定为 6
- 很难重复
- 可以通过增加字符长度增加编码数量
- 无法预测
public class Codec {
String alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
HashMap<String, String> map = new HashMap<>();
Random rand = new Random();
String key = getRand();
public String getRand() {
StringBuilder sb = new StringBuilder();
// 长度固定为6
for (int i = 0; i < 6; i++) {
// 随机字符对应
sb.append(alphabet.charAt(rand.nextInt(62)));
}
return sb.toString();
}
public String encode(String longUrl) {
while (map.containsKey(key)) {
key = getRand();
}
map.put(key, longUrl);
return "http://tinyurl.com/" + key;
}
public String decode(String shortUrl) {
return map.get(shortUrl.replace("http://tinyurl.com/", ""));
}
}
System Design - Design a URL Shortener
Step 1: Understand the problem and establish design scope
- Can you give me an example?
- Suppose the original URL is https://www.youtube.com/id=123456&channel=jiakaobo&loggedin=jiakaobo. Your service should create an alias with shorter length: https://www.youtube.com/123adrqev. If you visit it, it should redirect you to the original URL
- What is the traffic volume?
- 100 million URLs are generated pre day.
- How long is the shortened URL?
- As short as possible
- What characters are allowed in the shortened URL?
- Shortened URL can be 0-9, a-z, A-Z
- Can shortened URLs be deleted or updated?
- For simplicity, let’s assume shortened URLs cannot be deleted or updated
- What are the basic use cases?
- 1) URL shortening: given a long URL, return a shorter URL
- 2) URL redirecting: given a shorter URL, redirect to the original URL
- 3) High availability, scalability, and fault tolerance considerations
- Estimation
- 1) Write operation: 100 million URLs are generated per day
- 2) Write operation per second: 100 million / 24 / 3600 = 1160
- 3) Read operation: assuming ratio of read operation to write operation is 10:1, so read operation per second: 1160 * 10 = 11600
- 4) Assuming the URL shortener service will run for 10 years, this means we must support 100 million * 365 = 36.5 billion records.
- 5) Assume average URL length is 100.
- 6) Storage requirement over 10 years: 365 billion * 100 bytes * 10 years = 365 TB
Step 2: Propose high-level design and get buy-in
1. URL shortening - create new short URL
POST api/v1/data/shorten
{
longUrl: longUrlString
}
return shortUrl
2. URL redirecting - redirect short URL to long URL
GET api/v1/shortUrl
return longURL for HTTP redirection
Redirection messages 300 - 308
- 301: moved permanently, browser will remember the redirected URL instead of check server
- 302: moved temporarily (not recommended)
- 303: same as 302, but convert any requests to GET
- 307: same as 307, but keep HTTP methods, use case like submit form
Step 3: Design deep dive
1. Database
CREATE TABLE Url (
id int NOT NULL AUTO_INCREMENT,
short_url varchar(255) NOT NULL,
long_url varchar(255) NOT NULL,
PRIMARY KEY (id)
);
id|short_url|long_url
12|123adrqev|https://www.youtube.com/id=123456&channel=jiakaobo&loggedin=jiakaobo
2. Hash function
1) Length
365 billion, 62 characters => 62 ^ 6 = ~56 billion, 62 ^ 7 = ~3521 billion
2) Base 62 conversion
id: 11157 十进制 -> (2)(55)(59) 62进制 -> 2TX -> https://www.tynyurl.com/2TX
3) Logic
if(longUrl in DB) {
return shortUrl
} else {
id = generateId() // ID generator service - distributed system
shortUrl = generateShortUrl(id)
saveToDB(id, shortUrl, longUrl)
return shortUrl
}
3. Performance
1) Save short/long URL into cache, when users hit shortURL, check cache
2) Use right HTTP status code, like 301
Step 4: Wrap up
- Rate limit
- Database scaling: replication, sharding
- Availability, consistency, reliability …