阿里真题:千万级数据分页优化
面试重要程度:⭐⭐⭐⭐⭐
真题来源:阿里巴巴2024校招技术面试
考察重点:大数据量分页、SQL优化、系统架构设计
预计阅读时间:40分钟
真题背景
面试官: "我们有一个用户表,数据量达到了2000万条记录。现在需要实现分页查询功能,但是用户反馈翻页到后面几页时非常慢,有时候甚至超时。你能分析一下原因,并提供几种优化方案吗?请详细说明每种方案的实现和适用场景。"
考察意图:
- 对大数据量分页问题的深度理解
- SQL优化和数据库原理掌握程度
- 系统架构设计和权衡能力
- 实际项目经验和解决问题的思路
🎯 问题分析
传统分页的性能问题
问题根源:
-- 传统LIMIT分页查询 SELECT * FROM users ORDER BY id LIMIT 1000000, 20; -- 执行计划分析 EXPLAIN SELECT * FROM users ORDER BY id LIMIT 1000000, 20; /** * 性能问题分析: * 1. MySQL需要扫描前1000020条记录 * 2. 然后丢弃前1000000条,只返回20条 * 3. 随着offset增大,扫描的数据量线性增长 * 4. 大量的磁盘I/O和内存消耗 * 5. 排序操作的开销随数据量增加 */
性能测试对比:
/**
* 分页性能测试
*/
@Component
public class PaginationPerformanceTest {
@Autowired
private UserMapper userMapper;
/**
* 测试不同offset的查询性能
*/
public void testPaginationPerformance() {
int pageSize = 20;
int[] offsets = {0, 1000, 10000, 100000, 500000, 1000000};
for (int offset : offsets) {
long startTime = System.currentTimeMillis();
List<User> users = userMapper.selectList(
new QueryWrapper<User>()
.orderByAsc("id")
.last("LIMIT " + offset + ", " + pageSize)
);
long endTime = System.currentTimeMillis();
long duration = endTime - startTime;
System.out.printf("Offset: %d, Duration: %dms, Records: %d%n",
offset, duration, users.size());
}
/**
* 典型测试结果:
* Offset: 0, Duration: 5ms, Records: 20
* Offset: 1000, Duration: 8ms, Records: 20
* Offset: 10000, Duration: 25ms, Records: 20
* Offset: 100000, Duration: 180ms, Records: 20
* Offset: 500000, Duration: 850ms, Records: 20
* Offset: 1000000, Duration: 2100ms, Records: 20
*/
}
}
🚀 优化方案详解
方案1:游标分页(推荐)
核心思想: 使用上一页的最后一条记录作为查询起点,避免使用OFFSET。
实现方案:
/**
* 游标分页实现
*/
@Data
public class CursorPageRequest {
private Long cursor; // 游标值(上一页最后一条记录的ID)
private Integer pageSize = 20;
private String direction = "next"; // next/prev
}
@Data
public class CursorPageResponse<T> {
private List<T> data;
private Long nextCursor; // 下一页游标
private Long prevCursor; // 上一页游标
private Boolean hasNext;
private Boolean hasPrev;
}
@Service
public class CursorPaginationService {
@Autowired
private UserMapper userMapper;
/**
* 游标分页查询
*/
public CursorPageResponse<User> getUsersByCursor(CursorPageRequest request) {
List<User> users;
if (request.getCursor() == null) {
// 第一页查询
users = userMapper.selectList(
new QueryWrapper<User>()
.orderByAsc("id")
.last("LIMIT " + (request.getPageSize() + 1))
);
} else if ("next".equals(request.getDirection())) {
// 下一页查询
users = userMapper.selectList(
new QueryWrapper<User>()
.gt("id", request.getCursor())
.orderByAsc("id")
.last("LIMIT " + (request.getPageSize() + 1))
);
} else {
// 上一页查询
users = userMapper.selectList(
new QueryWrapper<User>()
.lt("id", request.getCursor())
.orderByDesc("id")
.last("LIMIT " + (request.getPageSize() + 1))
);
// 需要反转结果顺序
Collections.reverse(users);
}
return buildCursorResponse(users, request.getPageSize());
}
private CursorPageResponse<User> buildCursorResponse(List<User> users, int pageSize) {
CursorPageResponse<User> response = new CursorPageResponse<>();
boolean hasNext = users.size() > pageSize;
if (hasNext) {
users = users.subList(0, pageSize);
}
response.setData(users);
response.setHasNext(hasNext);
response.setHasPrev(users.size() > 0 && users.get(0).getId() > 1);
if (!users.isEmpty()) {
response.setNextCursor(users.get(users.size() - 1).getId());
response.setPrevCursor(users.get(0).getId());
}
return response;
}
}
前端实现:
/**
* 前端游标分页组件
*/
class CursorPagination {
constructor() {
this.currentCursor = null;
this.cursorStack = []; // 存储游标历史,支持返回上一页
this.pageSize = 20;
}
// 加载第一页
async loadFirstPage() {
const response = await this.fetchData({
cursor: null,
pageSize: this.pageSize,
direction: 'next'
});
this.currentCursor = response.nextCursor;
this.cursorStack = [null]; // 重置游标栈
return response;
}
// 加载下一页
async loadNextPage() {
if (!this.hasNext()) return null;
this.cursorStack.push(this.currentCursor);
const response = await this.fetchData({
cursor: this.currentCursor,
pageSize: this.pageSize,
direction: 'next'
});
this.currentCursor = response.nextCursor;
return response;
}
// 加载上一页
async loadPrevPage() {
if (this.cursorStack.length <= 1) return null;
this.cursorStack.pop(); // 移除当前游标
const prevCursor = this.cursorStack[this.cursorStack.length - 1];
const response = await this.fetchData({
cursor: prevCursor,
pageSize: this.pageSize,
direction: 'next'
});
this.currentCursor = response.nextCursor;
return response;
}
async fetchData(params) {
const response = await fetch('/api/users/cursor', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(params)
});
return response.json();
}
}
方案2:延迟关联优化
核心思想: 先通过索引获取主键ID,再关联查询完整数据。
实现方案:
/**
* 延迟关联分页实现
*/
@Service
public class DelayedJoinPaginationService {
@Autowired
private UserMapper userMapper;
/**
* 延迟关联分页查询
*/
public PageResponse<User> getUsersByDelayedJoin(int page, int pageSize) {
int offset = (page - 1) * pageSize;
// 第一步:通过覆盖索引快速获取ID列表
List<Long> userIds = userMapper.selectIdsByPage(offset, pageSize);
if (userIds.isEmpty()) {
return new PageResponse<>(Collections.emptyList(), 0, page, pageSize);
}
// 第二步:根据ID列表查询完整用户信息
List<User> users = userMapper.selectBatchIds(userIds);
剩余60%内容,订阅专栏后可继续查看/也可单篇购买
Java面试圣经 文章被收录于专栏
Java面试圣经,带你练透java圣经
