Reading and Writing in HBase After a Delete

1 HBase delete operations

 

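Deleting a column family

Delete delete = new Delete(rowKey);

delete.addFamily(columnFamily, tm);

Deletes every value in the column family whose timestamp is less than or equal to the given timestamp. In the 1.2 client, calling setTimestamp after addFamily does not change a marker that has already been added, so pass the timestamp to addFamily directly. If no timestamp is given, the marker is stamped with "now", the server's System.currentTimeMillis(); no prior lookup is needed.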

 

Deleting a column

Delete delete = new Delete(rowKey);

delete.addColumns(columnFamily, column, tm);

Deletes every value in the given column whose timestamp is less than or equal to the given timestamp. If no timestamp is given, the marker is stamped with "now", the server's System.currentTimeMillis(); no prior lookup is needed.

 

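Deleting one version of a value

Delete delete = new Delete(rowKey);

delete.addColumn(columnFamily, column, timestamp);

Deletes only the value in the given column whose timestamp equals the given timestamp. If addColumn is called without a timestamp, the server first looks up the most recent cell's timestamp and places the delete there. [This is expensive, because it needs a get to find the newest existing version before the delete.]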
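The difference between the three calls comes down to which cell timestamps a marker covers. The following is a minimal sketch of that rule as a toy model, not HBase code; the enum MarkerScope and its constant names are hypothetical and only mirror the behaviour described in the Delete Javadoc.

```java
// Toy model of delete-marker coverage. Names are hypothetical, not HBase API.
enum MarkerScope {
    FAMILY_UP_TO,   // models addFamily(family, ts)
    COLUMN_UP_TO,   // models addColumns(family, qualifier, ts)
    EXACT_VERSION;  // models addColumn(family, qualifier, ts)

    // True if a marker at markerTs hides a cell written at cellTs.
    boolean covers(long markerTs, long cellTs) {
        if (this == EXACT_VERSION) {
            return cellTs == markerTs;
        }
        return cellTs <= markerTs;
    }
}

class MarkerScopeDemo {
    public static void main(String[] args) {
        long markerTs = 2L;
        for (MarkerScope scope : MarkerScope.values()) {
            for (long cellTs = 1L; cellTs <= 3L; cellTs++) {
                System.out.println(scope + " marker@" + markerTs
                        + " covers cell@" + cellTs + ": "
                        + scope.covers(markerTs, cellTs));
            }
        }
    }
}
```

With a marker at timestamp 2, the family and column scopes hide cells at 1 and 2 but not 3, while the exact-version scope hides only the cell at 2.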

 

2 Reading and writing after a delete

 

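Deleting a column family

delete.addFamily(columnFamily, ts);

After the column family is deleted, data with a timestamp <= ts can no longer be read, and data written afterwards with a timestamp <= ts does not show up either: the put is accepted, but it stays hidden behind the delete marker.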

 

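Deleting a column

delete.addColumns(columnFamily, column, ts);

After the column is deleted, data with a timestamp <= ts can no longer be read, and data written afterwards with a timestamp <= ts does not show up either.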

 

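Deleting one version

delete.addColumn(columnFamily, column, ts);

After one version is deleted, only data with a timestamp exactly equal to ts can no longer be read, and only data written afterwards with a timestamp exactly equal to ts fails to show up.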

 

After one version of a value is deleted, can the versions before it still be read?

/**
 * uuid-1, 1L;
 * uuid-2, 2L;
 *
 * cache --> uuid-2, 3L;
 *
 * del uuid-2, 2L;
 * put uuid-2, 3L;
 *
 * get all:
 * uuid-1 1L
 * uuid-2 3L
 *
 */
@Test
public void testLimitUpdate() throws IOException {
    // write into two uuid
    String member = "member-limit-4";
    String uuid1 = "uuid-1";
    String uuid2 = "uuid-2";
    Long uuid2Time = Long.valueOf(2L);

    Map<String, Long> map = new HashMap<>();
    map.put(uuid1, Long.valueOf(1L));
    map.put(uuid2, uuid2Time);

    for (Map.Entry<String, Long> entry: map.entrySet()) {
        putUuid(member, entry.getKey(), entry.getValue());
    }
    System.out.println("---> after put ");
    getLimit(member, HColumn.UUID.name());

    // delete uuid2 2L
    delLimit(member, uuid2Time);
    System.out.println("---> after del ");
    getLimit(member, HColumn.UUID.name());

    // update uuid2, timestamp = 3L
    putUuid(member, uuid2, Long.valueOf(3L));
    System.out.println("---> after update uuid2");
    getLimit(member, HColumn.UUID.name());

}

Summary: after one version of a value is deleted, the versions before it can still be read; only the deleted version is gone.

 

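Writing after one version is deleted

After one version is deleted, can a value be written at that same timestamp, or at an earlier one?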

@Test
public void testLimitUpdate() throws IOException {
    // write into two uuid
    String member = "member-limit-11";
    String uuid1 = "uuid-1";
    String uuid3 = "uuid-3";

    putUuid(member, uuid1, Long.valueOf(1L));
    putUuid(member, uuid3, Long.valueOf(3L));
    System.out.println("---> after put u1, u3");
    getLimit(member, HColumn.UUID.name());

    // delete 3L
    delLimit(member, Long.valueOf(3L));
    System.out.println("---> after del 3L");
    getLimit(member, HColumn.UUID.name());

    // put uuid-2 with timestamp = 3L: the write does not show up (masked by the delete marker)
    putUuid(member, "uuid-2", Long.valueOf(3L));
    System.out.println("---> after put u2, 3L");
    getLimit(member, HColumn.UUID.name());

    // put uuid3, timestamp = 2L, write OK
    putUuid(member, "uuid-3", Long.valueOf(2L));
    System.out.println("---> after put u3, 2L");
    getLimit(member, HColumn.UUID.name());
}

 

---> after put u1, u3
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-3, timestamp => 3
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
---> after del 3L
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
---> after put u2, 3L
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1
---> after put u3, 2L
get limit version: 100
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-3, timestamp => 2
rowkey => member-limit-11, columnFamily => cf, column => uuid, value => uuid-1, timestamp => 1

After one version is deleted, a value written with the same timestamp does not show up; a value written with a different timestamp does.
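The output above can be reproduced with a small in-memory model. This is only a sketch under simplifying assumptions: the class ToyColumn is hypothetical, it models a single column with exact-version delete markers, and it ignores everything else a real HBase store does.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy single-column store (hypothetical, not HBase code):
// stored versions plus exact-version delete markers.
class ToyColumn {
    private final TreeMap<Long, String> versions = new TreeMap<>();
    private final Set<Long> deletedVersions = new HashSet<>();

    // A put is always stored, whether or not a marker hides it.
    void put(long ts, String value) { versions.put(ts, value); }

    // A delete only records a marker; nothing is removed.
    void deleteVersion(long ts) { deletedVersions.add(ts); }

    // Visible values, newest first, skipping timestamps hidden by a marker.
    List<String> read() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> e : versions.descendingMap().entrySet()) {
            if (!deletedVersions.contains(e.getKey())) {
                out.add(e.getValue() + "@" + e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ToyColumn col = new ToyColumn();
        col.put(1L, "uuid-1");
        col.put(3L, "uuid-3");
        System.out.println(col.read()); // [uuid-3@3, uuid-1@1]
        col.deleteVersion(3L);
        System.out.println(col.read()); // [uuid-1@1]
        col.put(3L, "uuid-2");          // stored, but hidden by the marker at 3
        System.out.println(col.read()); // [uuid-1@1]
        col.put(2L, "uuid-3");          // different timestamp, so it is visible
        System.out.println(col.read()); // [uuid-3@2, uuid-1@1]
    }
}
```

The point of the model is that the put at timestamp 3 is not rejected; it is simply filtered out on read because the marker at 3 is still present.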

 

For the underlying reasons, see:

https://hbase.apache.org/book.html#_delete

https://hbase.apache.org/book.html#version.delete

 

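A strange problem I ran into earlier:

I deleted a column and then wrote the same data back to that column, and no matter what I tried, the write never showed up. For a while I suspected our wrapper code was broken.

Later, for unrelated reasons, I dropped the table, and after that the write went through again.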

 

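Root cause:

An HBase Delete does not erase the data at its location in the store files; it is a mark-as-deleted operation.
That is, a delete adds a record of the form <delete, cell, timestamp>. Until the next major compaction, this delete record and the real data records, for example <cell, timestamp1>, both exist in HBase storage.
 
On a read, both <delete, cell, timestamp> and <cell, timestamp1> are read, and the delete record decides what is returned to the user: every cell it covers (a timestamp not newer than the delete's, or exactly equal to it for a single-version delete) is treated as deleted. A put that reuses such a timestamp is therefore stored but stays invisible, and dropping the table made the write visible again because the delete record was discarded along with it.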
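The role of major compaction can be illustrated with another toy model. This is a sketch under stated assumptions, not how HBase is implemented: the class ToyTombstoneColumn and its methods are hypothetical, it tracks one column with a single "up to timestamp" marker, and majorCompact() simply drops the marker together with the cells it covers.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy column with one "delete everything <= ts" marker (hypothetical, not HBase code).
class ToyTombstoneColumn {
    private final TreeMap<Long, String> cells = new TreeMap<>();
    private long markerTs = -1L; // -1 means no marker is present

    void put(long ts, String value) { cells.put(ts, value); }

    // Record a marker covering all timestamps <= ts; no data is removed.
    void deleteUpTo(long ts) { markerTs = Math.max(markerTs, ts); }

    // Newest visible value, or null if every stored cell is covered by the marker.
    String get() {
        Map.Entry<Long, String> newest = cells.lastEntry();
        if (newest == null || newest.getKey() <= markerTs) {
            return null;
        }
        return newest.getValue();
    }

    // Models a major compaction: covered cells and the marker itself are dropped.
    void majorCompact() {
        cells.headMap(markerTs, true).clear();
        markerTs = -1L;
    }

    public static void main(String[] args) {
        ToyTombstoneColumn col = new ToyTombstoneColumn();
        col.put(5L, "v");
        col.deleteUpTo(10L);
        col.put(5L, "v");              // stored, but still covered by the marker
        System.out.println(col.get()); // null
        col.majorCompact();            // marker is gone
        col.put(5L, "v");
        System.out.println(col.get()); // v
    }
}
```

In this model the same put at timestamp 5 is invisible while the marker exists and visible once the marker has been compacted away, which matches the behaviour seen before and after the table was dropped.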

 

So HBase deletes are tombstone deletes; I had simply forgotten.


 

 

Appendix: shared helper methods used by the tests:

public static void putUuid(String memberSrl, String uuid, long timestamp) {
    String rowKey = LIMIT_CONTROL.getRowKey(memberSrl);

    Put put = new Put(Bytes.toBytes(rowKey));
    byte[] value = Bytes.toBytes(uuid);
    put.addColumn(LIMIT_CONTROL.getCF(), HColumn.UUID.getCol(), timestamp, value);

    HBaseUtil.put(LIMIT_CONTROL, Arrays.asList(put));
}

public static void delLimit(String memberSrl, long timestamp) {
    String rowKey = LIMIT_CONTROL.getRowKey(memberSrl);

    Delete del = new Delete(Bytes.toBytes(rowKey));
    del.addColumn(LIMIT_CONTROL.getCF(), HColumn.UUID.getCol(), timestamp);

    List<Delete> deletes = new ArrayList<>();
    deletes.add(del);

    HBaseUtil.del(LIMIT_CONTROL, deletes);
}

public static void getLimit(String memberSrl, String trustType) throws IOException {
    String rowKey = LIMIT_CONTROL.getRowKey(memberSrl);
    Get get = new Get(Bytes.toBytes(rowKey));
    get.addColumn(LIMIT_CONTROL.getCF(), HColumn.valueOf(trustType).getCol());

    int limit = DurationLimit.getLimit(HColumn.UUID.name());
    System.out.println("get limit version: " + limit);
    //!!! set version is important for limit control
    get.setMaxVersions(limit);

    Result[] results = HBaseUtil.get(LIMIT_CONTROL, get);
    for (Result result: results) {
        CellUtil.displayResult(result, String.class);
    }
}

 

Appendix: the Delete source code from HBase client 1.2:

/**
 * Used to perform Delete operations on a single row.
 * <p>
 * To delete an entire row, instantiate a Delete object with the row
 * to delete.  To further define the scope of what to delete, perform
 * additional methods as outlined below.
 * <p>
 * To delete specific families, execute {@link #addFamily(byte[]) deleteFamily}
 * for each family to delete.
 * <p>
 * To delete multiple versions of specific columns, execute
 * {@link #addColumns(byte[], byte[]) deleteColumns}
 * for each column to delete.
 * <p>
 * To delete specific versions of specific columns, execute
 * {@link #addColumn(byte[], byte[], long) deleteColumn}
 * for each column version to delete.
 * <p>
 * Specifying timestamps, deleteFamily and deleteColumns will delete all
 * versions with a timestamp less than or equal to that passed.  If no
 * timestamp is specified, an entry is added with a timestamp of 'now'
 * where 'now' is the servers's System.currentTimeMillis().
 * Specifying a timestamp to the deleteColumn method will
 * delete versions only with a timestamp equal to that specified.
 * If no timestamp is passed to deleteColumn, internally, it figures the
 * most recent cell's timestamp and adds a delete at that timestamp; i.e.
 * it deletes the most recently added cell.
 * <p>The timestamp passed to the constructor is used ONLY for delete of
 * rows.  For anything less -- a deleteColumn, deleteColumns or
 * deleteFamily -- then you need to use the method overrides that take a
 * timestamp.  The constructor timestamp is not referenced.
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public class Delete extends Mutation implements Comparable<Row> {
  /**
   * Create a Delete operation for the specified row.
   * <p>
   * If no further operations are done, this will delete everything
   * associated with the specified row (all versions of all columns in all
   * families).
   * @param row row key
   */
  public Delete(byte [] row) {
    this(row, HConstants.LATEST_TIMESTAMP);
  }

  /**
   * Create a Delete operation for the specified row and timestamp.<p>
   *
   * If no further operations are done, this will delete all columns in all
   * families of the specified row with a timestamp less than or equal to the
   * specified timestamp.<p>
   *
   * This timestamp is ONLY used for a delete row operation.  If specifying
   * families or columns, you must specify each timestamp individually.
   * @param row row key
   * @param timestamp maximum version timestamp (only for delete row)
   */
  public Delete(byte [] row, long timestamp) {
    this(row, 0, row.length, timestamp);
  }

  /**
   * Create a Delete operation for the specified row and timestamp.<p>
   *
   * If no further operations are done, this will delete all columns in all
   * families of the specified row with a timestamp less than or equal to the
   * specified timestamp.<p>
   *
   * This timestamp is ONLY used for a delete row operation.  If specifying
   * families or columns, you must specify each timestamp individually.
   * @param rowArray We make a local copy of this passed in row.
   * @param rowOffset
   * @param rowLength
   */
  public Delete(final byte [] rowArray, final int rowOffset, final int rowLength) {
    this(rowArray, rowOffset, rowLength, HConstants.LATEST_TIMESTAMP);
  }

  /**
   * Create a Delete operation for the specified row and timestamp.<p>
   *
   * If no further operations are done, this will delete all columns in all
   * families of the specified row with a timestamp less than or equal to the
   * specified timestamp.<p>
   *
   * This timestamp is ONLY used for a delete row operation.  If specifying
   * families or columns, you must specify each timestamp individually.
   * @param rowArray We make a local copy of this passed in row.
   * @param rowOffset
   * @param rowLength
   * @param ts maximum version timestamp (only for delete row)
   */
  public Delete(final byte [] rowArray, final int rowOffset, final int rowLength, long ts) {
    checkRow(rowArray, rowOffset, rowLength);
    this.row = Bytes.copy(rowArray, rowOffset, rowLength);
    setTimestamp(ts);
  }

  /**
   * @param d Delete to clone.
   */
  public Delete(final Delete d) {
    this.row = d.getRow();
    this.ts = d.getTimeStamp();
    this.familyMap.putAll(d.getFamilyCellMap());
    this.durability = d.durability;
    for (Map.Entry<String, byte[]> entry : d.getAttributesMap().entrySet()) {
      this.setAttribute(entry.getKey(), entry.getValue());
    }
  }

  /**
   * Advanced use only.
   * Add an existing delete marker to this Delete object.
   * @param kv An existing KeyValue of type "delete".
   * @return this for invocation chaining
   * @throws IOException
   */
  @SuppressWarnings("unchecked")
  public Delete addDeleteMarker(Cell kv) throws IOException {
    // TODO: Deprecate and rename 'add' so it matches how we add KVs to Puts.
    if (!CellUtil.isDelete(kv)) {
      throw new IOException("The recently added KeyValue is not of type "
          + "delete. Rowkey: " + Bytes.toStringBinary(this.row));
    }
    if (Bytes.compareTo(this.row, 0, row.length, kv.getRowArray(),
        kv.getRowOffset(), kv.getRowLength()) != 0) {
      throw new WrongRowIOException("The row in " + kv.toString() +
        " doesn't match the original one " +  Bytes.toStringBinary(this.row));
    }
    byte [] family = CellUtil.cloneFamily(kv);
    List<Cell> list = familyMap.get(family);
    if (list == null) {
      list = new ArrayList<Cell>();
    }
    list.add(kv);
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete all versions of all columns of the specified family.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @return this for invocation chaining
   * @deprecated Since 1.0.0. Use {@link #addFamily(byte[])}
   */
  @Deprecated
  public Delete deleteFamily(byte [] family) {
    return addFamily(family);
  }

  /**
   * Delete all versions of all columns of the specified family.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @return this for invocation chaining
   */
  public Delete addFamily(final byte [] family) {
    this.deleteFamily(family, this.ts);
    return this;
  }

  /**
   * Delete all columns of the specified family with a timestamp less than
   * or equal to the specified timestamp.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   * @deprecated Since 1.0.0. Use {@link #addFamily(byte[], long)}
   */
  @Deprecated
  public Delete deleteFamily(byte [] family, long timestamp) {
    return addFamily(family, timestamp);
  }

  /**
   * Delete all columns of the specified family with a timestamp less than
   * or equal to the specified timestamp.
   * <p>
   * Overrides previous calls to deleteColumn and deleteColumns for the
   * specified family.
   * @param family family name
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   */
  public Delete addFamily(final byte [] family, final long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    List<Cell> list = familyMap.get(family);
    if(list == null) {
      list = new ArrayList<Cell>();
    } else if(!list.isEmpty()) {
      list.clear();
    }
    KeyValue kv = new KeyValue(row, family, null, timestamp, KeyValue.Type.DeleteFamily);
    list.add(kv);
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete all columns of the specified family with a timestamp equal to
   * the specified timestamp.
   * @param family family name
   * @param timestamp version timestamp
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addFamilyVersion(byte[], long)}
   */
  @Deprecated
  public Delete deleteFamilyVersion(byte [] family, long timestamp) {
    return addFamilyVersion(family, timestamp);
  }

  /**
   * Delete all columns of the specified family with a timestamp equal to
   * the specified timestamp.
   * @param family family name
   * @param timestamp version timestamp
   * @return this for invocation chaining
   */
  public Delete addFamilyVersion(final byte [] family, final long timestamp) {
    List<Cell> list = familyMap.get(family);
    if(list == null) {
      list = new ArrayList<Cell>();
    }
    list.add(new KeyValue(row, family, null, timestamp,
          KeyValue.Type.DeleteFamilyVersion));
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete all versions of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumns(byte[], byte[])}
   */
  @Deprecated
  public Delete deleteColumns(byte [] family, byte [] qualifier) {
    return addColumns(family, qualifier);
  }

  /**
   * Delete all versions of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   */
  public Delete addColumns(final byte [] family, final byte [] qualifier) {
    addColumns(family, qualifier, this.ts);
    return this;
  }

  /**
   * Delete all versions of the specified column with a timestamp less than
   * or equal to the specified timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumns(byte[], byte[], long)}
   */
  @Deprecated
  public Delete deleteColumns(byte [] family, byte [] qualifier, long timestamp) {
    return addColumns(family, qualifier, timestamp);
  }

  /**
   * Delete all versions of the specified column with a timestamp less than
   * or equal to the specified timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp maximum version timestamp
   * @return this for invocation chaining
   */
  public Delete addColumns(final byte [] family, final byte [] qualifier, final long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    List<Cell> list = familyMap.get(family);
    if (list == null) {
      list = new ArrayList<Cell>();
    }
    list.add(new KeyValue(this.row, family, qualifier, timestamp,
        KeyValue.Type.DeleteColumn));
    familyMap.put(family, list);
    return this;
  }

  /**
   * Delete the latest version of the specified column.
   * This is an expensive call in that on the server-side, it first does a
   * get to find the latest versions timestamp.  Then it adds a delete using
   * the fetched cells timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumn(byte[], byte[])}
   */
  @Deprecated
  public Delete deleteColumn(byte [] family, byte [] qualifier) {
    return addColumn(family, qualifier);
  }

  /**
   * Delete the latest version of the specified column.
   * This is an expensive call in that on the server-side, it first does a
   * get to find the latest versions timestamp.  Then it adds a delete using
   * the fetched cells timestamp.
   * @param family family name
   * @param qualifier column qualifier
   * @return this for invocation chaining
   */
  public Delete addColumn(final byte [] family, final byte [] qualifier) {
    this.deleteColumn(family, qualifier, this.ts);
    return this;
  }

  /**
   * Delete the specified version of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp version timestamp
   * @return this for invocation chaining
   * @deprecated Since hbase-1.0.0. Use {@link #addColumn(byte[], byte[], long)}
   */
  @Deprecated
  public Delete deleteColumn(byte [] family, byte [] qualifier, long timestamp) {
    return addColumn(family, qualifier, timestamp);
  }

  /**
   * Delete the specified version of the specified column.
   * @param family family name
   * @param qualifier column qualifier
   * @param timestamp version timestamp
   * @return this for invocation chaining
   */
  public Delete addColumn(byte [] family, byte [] qualifier, long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    List<Cell> list = familyMap.get(family);
    if(list == null) {
      list = new ArrayList<Cell>();
    }
    KeyValue kv = new KeyValue(this.row, family, qualifier, timestamp, KeyValue.Type.Delete);
    list.add(kv);
    familyMap.put(family, list);
    return this;
  }

  /**
   * Set the timestamp of the delete.
   *
   * @param timestamp
   */
  public void setTimestamp(long timestamp) {
    if (timestamp < 0) {
      throw new IllegalArgumentException("Timestamp cannot be negative. ts=" + timestamp);
    }
    this.ts = timestamp;
  }

  @Override
  public Map<String, Object> toMap(int maxCols) {
    // we start with the fingerprint map and build on top of it.
    Map<String, Object> map = super.toMap(maxCols);
    // why is put not doing this?
    map.put("ts", this.ts);
    return map;
  }
}